A COURSE OF 


MATHEMATICS 


V. I. Smirnov





INTEGRATION 
AND 
FUNCTIONAL 
ANALYSIS 


A COURSE OF 
Higher Mathematics 


VOLUME V 


V. I. SMIRNOV 


Translated by 
D. E. BROWN 


Translation edited by 


I. N. SNEDDON 


Simson Professor in Mathematics 
University of Glasgow 


PERGAMON PRESS 


OXFORD · LONDON · EDINBURGH · NEW YORK
PARIS · FRANKFURT


1964 


PERGAMON PRESS LTD. 
Headington Hill Hall, Oxford 
4 & 5 Fitzroy Square, London W. 1


PERGAMON PRESS (SCOTLAND) LTD.
2 & 3 Teviot Place, Edinburgh 1 


PERGAMON PRESS INC. 
122 East 55th Street, New York 22, N. Y. 


GAUTHIER-VILLARS ED. 
55 Quai des Grands-Augustins, Paris 6


PERGAMON PRESS G.m.b.H. 
Kaiserstrasse 75, Frankfurt am Main 


U.S.A. edition distributed by 


ADDISON-WESLEY PUBLISHING COMPANY INC. 
Reading, Massachusetts · Palo Alto · London


Copyright © 1964
PERGAMON PRESS LTD. 


Library of Congress Catalog Card Number 63-10134 


This translation has been made from the Russian Edition of V. I. Smirnov’s 
book Курс высшей математики (Kurs vysshei matematiki), published in 1960
by Fizmatgiz, Moscow 

MADE IN GREAT BRITAIN 


INTRODUCTION 


This is the final volume of Prof. Smirnov’s five-volume course of
higher mathematics, about whose history some remarks were made 
in the Introduction to Vol. I of the present English edition. 

The first Russian edition of this volume, published in 1947, enjoyed 
the distinction of being the first book in any language on the theory 
of integration and the elements of functional analysis to be written 
specifically with the needs of theoretical physicists in mind. Indeed 
nearly twenty years after its publication its only rivals would appear 
to be works by other Russian authors. 

Functional analysis arose as the result of generalizing various con- 
cepts and methods of classical branches of mathematics. Although it 
has become (in the manner characteristic of contemporary mathe- 
matics) a very abstract discipline, its general results can be used to 
derive the solution of particular problems in classical analysis and in 
applied mathematics. Its successes have been such that it is difficult 
to imagine that a strong light cannot be cast on the solution of almost 
any problem in mathematical analysis by the use of the concepts and 
techniques of functional analysis. Large areas of the modern theories 
of approximation, differential equations and mathematical physics 
are dominated by these methods and so research workers in physics 
and engineering need to become familiar with the ideas of functional 
analysis. They will find a clear and authoritative introduction to 
these topics in this volume, but it should not be regarded as of use 
to them only; students of pure mathematics will find here an account 
not only of the essentials of a flourishing branch of modern pure 
mathematics but also of its links with the past and of the motivation 
of much of the recent abstract work in the subject. 


I. N. SNEDDON 


PREFACE 


IN MODERN theoretical treatments of mathematical physics great 
importance attaches to the theory of functions of a real variable, 
the various functional spaces and the general theory of operators. 
These subjects provide the essential material for the present book, 
which is based on the fifth volume of my Course of Higher Mathe- 
matics, published in 1947. 

The branches of the theory of functions of a real variable in the 
present book include the theory of the classical Stieltjes integral, the 
Lebesgue-Stieltjes integral and the theory of completely additive 
set functions. 

The first chapter discusses the theory of the classical Stieltjes 
integral, and also considers the more general definition of the Stieltjes 
integral over an interval of any type, based on the equality of the 
upper and lower Darboux integrals with a subdivision of the basic 
interval into intervals of any type. The Fourier-Stieltjes and Cauchy-Stieltjes
integrals are taken as examples of the classical Stieltjes
integral, and inversion formulae are established for these. The Stieltjes 
integral is also defined for the plane case. 

The space C of continuous functions is also discussed in Chapter I, 
and the general form of linear functionals in this space is established. 

The second chapter deals with the foundations of the metric theory 
of functions of a real variable and the Lebesgue-Stieltjes integral. 
The whole of the theory is expounded for the case of a plane and the 
possibility of its obvious generalization to the case of n-dimensional 
Euclidean space is indicated. The theory of measure is built up on 
the basis of any non-negative, additive, normal function, defined on 
semi-open two-dimensional intervals. The Lebesgue-Stieltjes integral
of a bounded function is defined on the basis of the coincidence of 
the upper and lower Darboux integrals when the basic measurable 
set is subdivided into measurable sets. Chapter II ends with a detailed 
discussion of an averaging process for functions and the properties 
of the mean functions, when the averaging kernel is subject to certain 
conditions. Wide use is subsequently made of the averaging process. 




The third chapter deals with the theory of completely additive 
set functions. After proving the initial theorems, the theorem on 
the decomposition of a completely additive set function into a singular 
and an absolutely continuous part is stated without proof, and the 
fundamental facts relating to this decomposition are discussed. The 
case of a single independent variable is treated in detail. Also, an 
absolutely continuous set function is studied in the general case, 
and the formula established for changing the variables in a multi- 
dimensional Lebesgue-Stieltjes integral.

The third chapter ends with a proof of the above-mentioned theorem 
on decomposing a completely additive set function into two terms. 
Furthermore, the concept of Hellinger integral is introduced in the 
multi-dimensional case, and its properties are investigated. In particu- 
lar, the connection is established between the Hellinger integral and 
the Lebesgue-Stieltjes integral. The case of the one-dimensional 
Hellinger integral is analyzed in detail. All the proofs at the end of 
Chapter III are based on a preliminary detailed treatment of the 
properties of completely additive set functions [78, 79]. 

The fourth chapter contains an exposition of the foundations of 
the general theory of metric and normed spaces. It ends with a 
detailed discussion of generalized derivatives, embedding theorems 
for the various function spaces, and the theory of functionals in 
the space of continuously differentiable functions. All these questions 
are related to S. L. Sobolev’s well-known investigations. They are 
dealt with in his monograph Some Applications Of Functional Analysis 
To Mathematical Physics (Nekotorye primeneniya funktsional’nogo 
analiza v matematicheskoi fizike) (1950). 

Generalized derivatives are defined in two ways: with the aid of
the formula for integration by parts and by means of the closure 
of functions with continuous derivatives; the equivalence of these 
definitions is proved. Special attention is paid to the case of a star- 
shaped domain. Furthermore, the complete normed functional spaces 
WD) and wD) are introduced; the first of these consists of the 
functions $\varphi(x)$ that are defined in the domain D and have all generalized
derivatives of order $l$, where $\varphi(x)$ and the derivatives in question
belong to $L_p(D)$, whilst the second space consists of the functions
$\varphi(x)$ that have all generalized derivatives up to and including order $l$.
It is subsequently proved that, for a wide class of domains D, OD) 
and W?(D) consist of the same set of functions, and that the norms 
introduced into them are equivalent. Moreover, fairly simple proofs
are given for space WD) of theorems that are particular cases of 
the embedding theorems for WD). 

These theorems are first formulated, then a complete proof 
of them is given in fine print, on the basis of Sobolev’s integral 
form. All this material is closely related to the above-mentioned 
monograph. 

The final fifth chapter deals with the general theory of Hilbert 
space, the whole of the treatment being first given for the case of 
bounded operators. Fredholm’s theorems are proved for linear 
equations with completely continuous operators. They have been 
stated without proof for normed spaces. 

The relevant integral forms in terms of the differential solutions 
are given with the aid of Hellinger integrals for self-conjugate operators 
on a continuous spectrum. Examples are given of the application of 
the general theory of bounded operators in $l_2$ and $L_2$.

The final section of the fifth chapter is devoted to the theory of 
unbounded operators in Hilbert space. After proving the general 
theorems, numerous examples are given of differential operators with 
one and several independent variables. The general theory of extension 
of closed symmetric operators is followed by a discussion of the 
special case of semi-bounded operators, and in particular, of their 
Friedrichs extensions. 

The publication of a sixth volume is envisaged, dealing with certain 
problems of the modern theory of differential operators with one and 
several independent variables. 

In addition to specialized articles, I have made use of numerous 
books in preparing the present volume. The chief titles are as follows: 
V. I. Glivenko, The Stieltjes Integral (Integral Stilt’esa); I. P. Natanson,
Theorie der Funktionen einer reellen Veränderlichen; Saks, Theory of the
Integral (Teoriya integrala); de la Vallée-Poussin, Intégrales de Lebesgue.
Fonctions d'ensembles. Classes de Baire; Stone, Linear Transformations
in Hilbert Space and their Applications to Analysis; N. I. Akhiezer
and I. M. Glazman, Theory of Linear Operators (Teoriya lineinykh
operatorov); A. I. Plesner, Spectral Theory of Linear Operators, I
(Spektral’naya teoriya lineinykh operatorov, I) (Uspekhi
matematicheskikh nauk, t. IX, 1941); N. I. Akhiezer, Infinite Jacobian
Matrices and the Problem of Moments (Beskonechnye matritsy
Jakobi i problema momentov) (loc. cit.); S. L. Sobolev, Some
Applications of Functional Analysis to Mathematical Physics (Nekotorye
primeneniya funktsional’nogo analiza v matematicheskoi fizike).




I want to thank S. M. Lozinskii for reading the original manuscript 
and making a number of valuable suggestions. 

The treatment of numerous problems in the second part of this 
book is due to Prof. O. A. Ladyzhenskaya, who is the associate 
author of the second part. I discussed in detail with her the plan of 
this book. 

M. S. Birman gave great assistance in preparing the second part 
of the book. He is responsible for the exposition of the sections 
dealing with embedding theorems [114-118] and with the theory of 
small perturbations of the spectrum [198]. He gave valuable advice 
on the spectra of symmetric operators and their extensions, as also 
on the treatment in Chapter IV. 

Let me express my indebtedness to O. A. Ladyzhenskaya and 
M. S. Birman. Without their help I should not have been able to 
carry the work through to the end. 

The first three chapters were read by G. P. Akilov, from whom 
I obtained a number of valuable suggestions regarding the treatment 
of certain problems. I tender him my sincere thanks. 


V. SMIRNOV 


CHAPTER I 


THE STIELTJES INTEGRAL 


1. Sets and their powers. The various concepts of integral play a 
large part in the application of mathematical analysis to present-day 
science, and we shall discuss in our first two chapters the theory of 
integration in a more general form than previously. As a preliminary, 
the present section contains a certain amount of elementary set 
theory, which is supplementary to that given in [IV; 16]. 

Suppose we have two sets $A_1$ and $A_2$, consisting of objects of any
type (elements). The sets are said to have the same power if a one-to-one
correspondence can be established between the elements of $A_1$
and the elements of $A_2$, i.e. a correspondence in which a definite
element of $A_2$ is associated with each element of $A_1$, and conversely,
each element of $A_2$ is associated with one and only one element of $A_1$.
An infinite set (i.e. a set containing an infinite number of elements)
is described as denumerable if it has the same power as the set of
all positive integers, i.e. if its elements can be enumerated by means
of positive integers: $a_1, a_2, a_3, \ldots$ Two denumerable sets have the
same power. Let us examine some properties of denumerable sets.
We consider the part of a denumerable set containing an infinite set
of elements $a_{p_1}, a_{p_2}, \ldots$, where $p_1, p_2, \ldots$ is an increasing sequence
of positive integers. The elements of this new set are also numbered.
The number of each element is the subscript of p. In other words, they
are numbered in order of increasing subscripts $p_1, p_2, \ldots$ An infinite
part of a denumerable set is therefore a denumerable set. We now
take two denumerable sets: $A(a_1, a_2, a_3, \ldots)$, consisting of elements
$a_1, a_2, a_3, \ldots$, and $B(b_1, b_2, b_3, \ldots)$, consisting of elements $b_1, b_2, b_3, \ldots$;
we form their sum, i.e. we combine the elements of both sets into a
single set C. The new set C thus obtained is generally called the sum
of sets A and B. This new set is also denumerable. For we only need
to arrange the elements of set C, say, in the following order: $a_1, b_1,
a_2, b_2, \ldots$, in order to see that C is denumerable. If there are identical
elements $a_i$, $b_k$, we have to take one of them and strike out the
remainder. A similar argument applies for the sum of a finite number of
denumerable sets, i.e. the sum of a finite number of denumerable sets 
is a denumerable set. 
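
The interleaving argument above can be sketched in code. This is only an illustration, not part of the original text: the two enumerations (multiples of 2 and of 3) and the function name `interleave_union` are assumptions chosen for the example.

```python
from itertools import count, islice

def interleave_union(enum_a, enum_b):
    """Enumerate the sum of two enumerated sets in the order a1, b1, a2, b2, ...
    An element that has already been listed is struck out (skipped)."""
    seen = set()
    for a, b in zip(enum_a, enum_b):
        for x in (a, b):
            if x not in seen:
                seen.add(x)
                yield x

# Hypothetical example sets: A = {2, 4, 6, ...}, B = {3, 6, 9, ...}
evens = (2 * n for n in count(1))
threes = (3 * n for n in count(1))
first_eight = list(islice(interleave_union(evens, threes), 8))
print(first_eight)  # [2, 3, 4, 6, 9, 8, 12, 10]
```

The common element 6 is listed once only, which corresponds to striking out the repeated elements in the text.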

Suppose we have a denumerable set of denumerable sets. The
elements of all these sets can be denoted by a letter with two integral
indices $a_q^{(p)}$. The upper index indicates the number of the set to which
the element belongs, and the lower the number which the element
has in the denumerable set to which it belongs. There is no difficulty
in enumerating all the elements $a_q^{(p)}$. We take as the first element
the one in which both indices are unity: $a_1^{(1)}$. We then take the elements
in which the sum of the indices is 3, and arrange them in order of
increasing upper index. We thus obtain $a_2^{(1)}, a_1^{(2)}$ as the second and
third elements of the sum of sets. We now take the elements in which
the sum of the indices is 4, and arrange them in order of increasing
upper index: $a_3^{(1)}, a_2^{(2)}, a_1^{(3)}$. This gives the fourth, fifth and sixth
elements of the sum of sets. It may be seen on continuing this
construction that the sum of a denumerable number of denumerable
sets is a denumerable set. This assertion would obviously still hold
if certain of the component sets were finite instead of denumerable.
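
The enumeration by increasing sum of indices can be made concrete as follows. This is a minimal sketch; the names `diagonal_indices` and `position` are hypothetical, and the closed formula for the position is an added observation rather than something stated in the text.

```python
from itertools import islice

def diagonal_indices():
    """Yield index pairs (p, q) = (upper, lower) in the order used in the text:
    increasing sum p + q, and for equal sums increasing upper index p."""
    total = 2
    while True:
        for p in range(1, total):
            yield (p, total - p)
        total += 1

def position(p, q):
    """Number assigned to the element with upper index p and lower index q."""
    s = p + q
    return (s - 1) * (s - 2) // 2 + p

first_six = list(islice(diagonal_indices(), 6))
print(first_six)  # [(1, 1), (1, 2), (2, 1), (1, 3), (2, 2), (3, 1)]
```

Every pair of positive integers receives exactly one number, which is the content of the statement that a denumerable sum of denumerable sets is denumerable.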

Let A be an infinite set. We choose any element of it and assign 
it the number one. The remainder of the set will be infinite, as before. 
We choose any element from it and assign it the number 2. On proceed- 
ing in this way, it will be seen that a denumerable set can be ex- 
tracted from any infinite set. The set remaining after such extraction 
may be either empty, i.e. contain no element at all, or may be finite, 
or infinite. Let us show that, if this remaining set is infinite, it has 
the same power as the original set, i.e. the following assertion holds: 
if, after extracting a denumerable set P from an infinite set A, an 
infinite set B remains, sets A and B have the same power. We extract 
from the infinite set B a further denumerable set Q, and let C be 
the remaining set. The original set A is now split into three sets:
$A = P + Q + C$, of which the set C may be empty, finite or infinite,
whilst sets P and Q are denumerable sets. We had $A = P + B$
prior to the second extraction. A one-to-one correspondence is readily
established between the elements of A and B; for we have
$A = P + Q + C$ and $B = Q + C$. The sum P + Q of denumerable sets
is a denumerable set, so that a one-to-one correspondence can be 
established between the elements of P + Q and Q. We put every 
element of the set C in correspondence with itself. A one-to-one 
correspondence will thus be established between the elements of A
and B. A direct consequence of the assertion just proved is that, 
if a denumerable set is added to an infinite set, the new set obtained 
will have the same power as the original set. Both the assertions 
regarding the subtraction and addition of a denumerable set remain 
in force if the denumerable set is replaced by a finite set. The proof 
is precisely the same as above. 

We mentioned earlier [IV; 15] that either the set of rational numbers 
belonging to an interval [a,b], or the set of all rational numbers, 
is denumerable. This is proved in essentially the same way as the 
statement that the sum of a denumerable number of denumerable 
sets is denumerable. The role of upper index is played by the numerator 
of the fraction, and the role of lower index by the denominator; 
it is necessary to start by considering positive fractions. Let us 
now adduce an example of a non-denumerable set. We take all the 
real numbers belonging to the interval [0,1]. We can write each of 
them, apart from zero, as an infinite decimal fraction with integral 
part equal to zero, and conversely, every such decimal fraction will 
correspond to a real number of our interval. We do not make use 
of finite fractions, since a finite fraction yields the same number 
as an infinite fraction having a 9 recurring, e.g. 0.37 = 0.36999.... 
Let us show that the set of these real numbers is non-denumerable. 
We use reductio ad absurdum. Suppose that all our decimal fractions, 
including the fraction 0.000..., giving the left-hand end of the interval,
can be enumerated. A new decimal fraction, with an integral part 
equal to zero, may be formed as follows. As the first figure after 
the decimal point we take a number different from the first figure 
of the first of the enumerated decimal fractions, as the second figure 
we take some number different from the second figure of the second 
of the enumerated decimal fractions, and so on. An infinite decimal 
fraction is obtained (we make no use of the figure 0 in forming the 
figures in the new decimal fraction), which differs from all the enu- 
merated fractions. Hence the real number corresponding to it is not 
enumerated, which contradicts the fact that all the real numbers of 
the interval [0,1] are enumerated. We have thus shown that the 
set of all the real numbers belonging to the interval [0,1] is non- 
denumerable. This set is said to have the power of a continuum. 
It may easily be seen that the set of the real numbers belonging to 
any finite interval [a, b] has the same power as the set of real numbers 
belonging to the interval [0, 1]. A one-to-one correspondence between 
the elements of these sets is established by means of the formula
$y = (x - a)/(b - a)$. When x runs through the interval [a,b], the
variable y runs through the interval [0,1]. If we use the formula
$y = \tan(\pi x - \pi/2)$, when x varies inside the interval [0, 1], y runs
through the set of all real numbers, i.e. the set of all real numbers 
also has the power of a continuum. If the ends of the interval are 
not included in the set, this does not change its power, inasmuch as 
the subtraction or addition of a finite set from or to an infinite set 
does not change the power of the infinite set. 
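
The diagonal construction used in the proof of non-denumerability can be illustrated on a finite table of digits. The sketch below is an assumption-laden toy: it works with the first few digits of finitely many fractions, whereas the proof applies the same rule to an infinite list; the name `diagonal_fraction` and the sample rows are hypothetical.

```python
def diagonal_fraction(digit_rows):
    """Row n holds the digits after the decimal point of the n-th enumerated
    fraction. Return digits whose n-th figure differs from the n-th figure
    of row n, never using the figure 0 (as required in the text)."""
    digits = []
    for n, row in enumerate(digit_rows):
        digits.append("1" if row[n] != "1" else "2")
    return "".join(digits)

# Hypothetical initial digits of four enumerated fractions.
rows = ["3699", "5000", "1415", "9999"]
new_digits = diagonal_fraction(rows)
print("0." + new_digits + "...")
```

By construction the new fraction differs from the n-th enumerated fraction in the n-th figure, so it cannot occur anywhere in the list.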

We shall in future write [a,b] for a closed interval and (a, b) for 
an open interval, i.e. an interval from which the ends are excluded. 
If the left-hand end is excluded and the right-hand included, we 
use the symbol (a, b], and similarly for [a, b). The numbers a and b 
may take infinite values: $a = -\infty$ and $b = +\infty$, i.e. the intervals
discussed may be infinite on the left or right. For example, the closed
interval $[-\infty, +\infty]$ contains both the infinitely remote elements.
Correspondingly, the function f(x) may be defined for $x = -\infty$ and
$x = +\infty$, and we can write e.g. $f(-\infty)$. Continuity at $x = -\infty$ is
equivalent to the condition $\lim_{x \to -\infty} f(x) = f(-\infty)$. Similarly for $x \to +\infty$.
Furthermore, the usual notations may be used: $\lim_{x \to -\infty} f(x) = f(-\infty + 0)$
and $\lim_{x \to +\infty} f(x) = f(+\infty - 0)$.
It is easily shown [I; 43] that a function f(x), finite and continuous
in the closed interval $[-\infty, +\infty]$, is uniformly continuous in this
interval.


2. The Stieltjes integral and its basic properties. Let us recall the 
definition of Riemann integral, of which use has generally been 
made in the previous volumes. Let [a, b] be a finite interval and f(x)
a bounded function, given in this interval. We subdivide the interval:
$a = x_0 < x_1 < \ldots < x_{n-1} < x_n = b$, choose a point $\xi_k$ in each
sub-interval $[x_{k-1}, x_k]$ and form the sum of products:

$$\sigma = \sum_{k=1}^{n} f(\xi_k)(x_k - x_{k-1}). \tag{1}$$

If this sum has a finite limit A for any choice of points $\xi_k$ as the
subdivision becomes indefinitely finer, this limit is in fact called the
integral of f(x) over the interval [a, b]. Let $\delta$ be the greatest of the
differences $x_k - x_{k-1}$. An indefinitely fine subdivision of [a,b] is
equivalent to the fact that $\delta \to 0$, and the existence of the finite
limit A for the sum (1) is equivalent to the following: given any
positive $\varepsilon$, there exists a positive $\eta$ such that

$$\left| A - \sum_{k=1}^{n} f(\xi_k)(x_k - x_{k-1}) \right| < \varepsilon \quad \text{for } \delta < \eta.$$


A more general integral can be constructed in essentially the same
way. It was first introduced by the Dutch mathematician Stieltjes
in 1894, in his studies on continued fractions, then was widely
developed and applied both in pure and applied mathematics. Let f(x)
and g(x) be two functions given in the finite interval [a,b], at every
point of which they take finite values. Instead of the sum (1), we
form the sum

$$\sigma = \sum_{k=1}^{n} f(\xi_k)\,[g(x_k) - g(x_{k-1})]. \tag{2}$$

We shall call this a Riemann-Stieltjes sum. If it tends to a definite
finite limit for any choice of points $\xi_k$ when the subdivision becomes
indefinitely finer, f(x) is said to be integrable with respect to the
function g(x) in the interval [a,b], and we write

$$\int_a^b f(x)\,dg(x) = \lim \sum_{k=1}^{n} f(\xi_k)\,[g(x_k) - g(x_{k-1})].$$

In the Riemann integral, the role of g(x) is played by x. The new 
integral evidently has many properties similar to the Riemann integral, 
and the proofs of these properties are precisely the same as for the 
Riemann integral. We give these properties on the assumption that 
all the integrals in the formulae below exist: 


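$$\int_a^b \sum_{k=1}^{p} a_k f_k(x)\,dg(x) = \sum_{k=1}^{p} a_k \int_a^b f_k(x)\,dg(x);$$

$$\int_a^b f(x)\,d\sum_{k=1}^{p} a_k g_k(x) = \sum_{k=1}^{p} a_k \int_a^b f(x)\,dg_k(x) \quad (a_k \text{ are constants}); \tag{3}$$

$$\int_a^b f(x)\,dg(x) = \int_a^c f(x)\,dg(x) + \int_c^b f(x)\,dg(x) \quad (a < c < b).$$

We have further the obvious equation:

$$\int_a^b dg(x) = g(b) - g(a). \tag{4}$$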


In the first and second of formulae (3), the existence of the integral 
on the left follows from the existence of the integrals on the right. 




Let us consider the proof of the formula for integration by parts.
Let the integral of g(x) with respect to f(x) exist; we show that the
integral of f(x) with respect to g(x) now exists. We transform the
sum (2) by collecting the terms containing the values of g(x) at
coincident points:

$$\sigma = -\sum_{k=1}^{n-1} g(x_k)\,[f(\xi_{k+1}) - f(\xi_k)] + g(b) f(\xi_n) - g(a) f(\xi_1).$$

On adding and subtracting the difference

$$[f(x) g(x)]_a^b = f(b) g(b) - f(a) g(a),$$

we can write

$$\sigma = [f(x) g(x)]_a^b - \Big\{ \sum_{k=1}^{n-1} g(x_k)\,[f(\xi_{k+1}) - f(\xi_k)] + g(a)\,[f(\xi_1) - f(a)] + g(b)\,[f(b) - f(\xi_n)] \Big\}. \tag{5}$$

The braces contain the Riemann-Stieltjes sum (2) for the integral of
g(x) with respect to f(x), formed for the subdivision of [a, b] by the
points $\xi_1, \ldots, \xi_n$, with $a, x_1, \ldots, x_{n-1}, b$ as the intermediate
points. By hypothesis, the integral of g(x) with
respect to f(x) exists, i.e. the expression in the braces tends to this
integral on indefinite subdivision of the interval. Hence, by (5), the
sum $\sigma$ has a limit, i.e. the integral of f(x) with respect to g(x) exists,
and we can write the formula for integration by parts:

$$\int_a^b f(x)\,dg(x) = [f(x) g(x)]_a^b - \int_a^b g(x)\,df(x) \tag{6}$$

or

$$\int_a^b f(x)\,dg(x) + \int_a^b g(x)\,df(x) = [f(x) g(x)]_a^b, \tag{7}$$

where the existence of one of the integrals written implies the existence 
of the other. 
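
The identity behind integration by parts holds exactly at the level of the finite sums, before any limit is taken, and this can be checked numerically. The sketch below is illustrative only: the functions sin and exp, the interval [0, 2], the random subdivision and the helper name `rs_sum` are assumptions made for the example.

```python
import math
import random

def rs_sum(f, g, points, tags):
    """Riemann-Stieltjes sum of f with respect to g for the given
    subdivision points and one intermediate point (tag) per sub-interval."""
    return sum(f(t) * (g(p1) - g(p0)) for p0, p1, t in zip(points, points[1:], tags))

random.seed(0)
a, b, n = 0.0, 2.0, 50
xs = sorted([a, b] + [random.uniform(a, b) for _ in range(n - 1)])
xis = [random.uniform(x0, x1) for x0, x1 in zip(xs, xs[1:])]

f, g = math.sin, math.exp
sigma = rs_sum(f, g, xs, xis)                      # sum (2) for f with respect to g
# Swapped roles: the points xi_k subdivide [a, b], the x_k are the tags.
sigma_swapped = rs_sum(g, f, [a] + xis + [b], xs)
boundary = f(b) * g(b) - f(a) * g(a)               # [f g] between a and b

print(abs(sigma + sigma_swapped - boundary))       # rounding error only
```

Since the two sums always add up to the boundary term, the existence of the limit of one of them implies the existence of the limit of the other.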

Two particular cases of the Stieltjes integral must be mentioned.
Suppose that the interval [a, b] is subdivided into a finite number
of parts: $a = c_0 < c_1 < \ldots < c_{p-1} < c_p = b$, and that g(x) has a
constant value $g_k$ inside each of the sub-intervals $(c_{k-1}, c_k)$. Thus
g(x) has a jump $s_k = g_{k+1} - g_k$ at every point $c_k$ lying inside the
interval [a, b]. Jumps are also possible at the ends of the interval:
the jump $s_0 = g_1 - g(a)$ at the left-hand end and $s_p = g(b) - g_p$
at the right-hand end. Suppose further that f(x) is continuous at
all the points of subdivision $c_k$ and at the ends of the interval. Let
$x_k$ be points which are not points of subdivision, excepting possibly
$c_0$ and $c_p$. In the sum (2), all the terms in which $x_{k-1}$ and $x_k$ lie inside
the same interval $(c_{j-1}, c_j)$ will vanish, since in this case
$g(x_{k-1}) = g(x_k)$. If the interval $[x_{k-1}, x_k]$ contains a point of discontinuity
$c_j$, $f(\xi_k)$ will tend to $f(c_j)$ on indefinite subdivision, and $g(x_k) - g(x_{k-1})$
to $s_j$, and it is immediately evident that (2) gives in the limit the
following finite sum:

$$\lim \sum_{k=1}^{n} f(\xi_k)\,[g(x_k) - g(x_{k-1})] = \sum_{k=0}^{p} f(c_k)\,s_k. \tag{8}$$


If $c_k$ is a point of subdivision of [a,b], we have to consider both
the intervals having $c_k$ as an end, and the result is found to be the
same. We now take a second particular case. Let f(x) and g(x) be
continuous in [a, b] and let g(x) have a derivative $g'(x)$ inside [a, b],
which is Riemann integrable and therefore bounded. On applying
Lagrange’s formula to the difference $g(x_k) - g(x_{k-1})$, we can write the
sum (2) as

$$\sum_{k=1}^{n} f(\xi_k)\,[g(x_k) - g(x_{k-1})] = \sum_{k=1}^{n} f(\xi_k)\,g'(\xi_k')\,(x_k - x_{k-1}), \tag{9}$$

where $\xi_k'$ is an interior point of $[x_{k-1}, x_k]$. We can put
$f(\xi_k) = f(\xi_k') + \varepsilon_k$, where, by virtue of the uniform continuity of f(x) in [a,b],
the greatest of the $|\varepsilon_k|$ tends to zero on indefinite subdivision, i.e.
given any positive $\varepsilon$, there exists a positive $\eta$ such that $|\varepsilon_k| < \varepsilon$ if
$\delta < \eta$. We can rewrite the sum (9) as

$$\sum_{k=1}^{n} f(\xi_k)\,[g(x_k) - g(x_{k-1})] = \sum_{k=1}^{n} f(\xi_k')\,g'(\xi_k')\,(x_k - x_{k-1}) + \sum_{k=1}^{n} \varepsilon_k\,g'(\xi_k')\,(x_k - x_{k-1}). \tag{9$_1$}$$

The product of two Riemann integrable functions is also integrable
[I; 117], and the first term on the right of (9$_1$) tends, on indefinitely
finer subdivision, to the Riemann integral of $f(x)\,g'(x)$. It may easily
be shown that the second term tends to zero. In fact, the function
$g'(x)$ is bounded, as mentioned above, i.e. $|g'(x)| \le M$, where M
is a definite positive number. As we have said, given a positive $\varepsilon$,
there exists a positive $\eta$ such that $|\varepsilon_k| < \varepsilon$ for $\delta < \eta$, and we have
the inequality:

$$\Big| \sum_{k=1}^{n} \varepsilon_k\,g'(\xi_k')\,(x_k - x_{k-1}) \Big| \le \sum_{k=1}^{n} M \varepsilon\,(x_k - x_{k-1}) = M \varepsilon\,(b - a),$$

from which it follows, since $\varepsilon$ is arbitrary, that the second term on
the right-hand side of (9$_1$) tends to zero. We therefore have in the
limit:

$$\int_a^b f(x)\,dg(x) = \int_a^b f(x)\,g'(x)\,dx, \tag{10}$$


i.e. given our assumptions, the Stieltjes integral reduces to an ordinary
Riemann integral. In the previous case, it degenerated to a finite
sum. It may easily be shown that (10) still holds if we require that
f(x) be Riemann integrable instead of continuous. We shall consider
later the question of the existence of the Stieltjes integral as defined
above, and of certain more general integrals, to be defined in due
course. An essential fact in all this will be that the function g(x) is
assumed non-decreasing in [a, b].
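
Both particular cases can be checked numerically. In the sketch below everything concrete is an assumption made for illustration: the step function `g_step` on [0, 3] with its values and jumps, the integrand x², the pair cos, sin for the differentiable case, and the helper names.

```python
import math

def rs_sum(f, g, xs, tags):
    """Riemann-Stieltjes sum (2) for subdivision xs and intermediate points tags."""
    return sum(f(t) * (g(x1) - g(x0)) for x0, x1, t in zip(xs, xs[1:], tags))

def uniform(a, b, n):
    """Uniform subdivision of [a, b] together with the midpoints as tags."""
    xs = [a + (b - a) * k / n for k in range(n + 1)]
    return xs, [(x0 + x1) / 2 for x0, x1 in zip(xs, xs[1:])]

def g_step(x):
    """Hypothetical step function: constant 1, 4, 2 inside (0,1), (1,2), (2,3),
    with g(0) = 0 and g(3) = 5, so the jumps are s_0 = 1, s_1 = 3, s_2 = -2, s_3 = 3."""
    if x == 0.0:
        return 0.0
    if x < 1.0:
        return 1.0
    if x == 1.0:
        return 2.5   # value at a jump point; never sampled below
    if x < 2.0:
        return 4.0
    if x == 2.0:
        return 3.0
    if x < 3.0:
        return 2.0
    return 5.0

# Case 1: formula (8). n is not a multiple of 3, so no x_k falls on c = 1 or c = 2.
xs, tags = uniform(0.0, 3.0, 100_000)
step_value = rs_sum(lambda x: x * x, g_step, xs, tags)
jump_sum = 0 * 1 + 1 * 3 + 4 * (-2) + 9 * 3      # sum of f(c_k) s_k = 22

# Case 2: formula (10) with f = cos, g = sin, so f g' = cos^2 on [0, 1].
xs2, tags2 = uniform(0.0, 1.0, 2000)
smooth_value = rs_sum(math.cos, math.sin, xs2, tags2)
riemann_value = 0.5 + math.sin(2.0) / 4.0        # integral of cos^2 over [0, 1]
```

In the first case the Riemann-Stieltjes sums approach the finite sum of f(c_k) s_k, in the second the ordinary Riemann integral of f(x) g'(x).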

In future, we shall often describe a non-decreasing function as increas- 
ing. The maximum of such a function is g(b), and its minimum g(a). 
The following section is of a preparatory nature. It is of fundamental 
importance, not only for investigating the existence of the Stieltjes 
integral as defined above, but also for studying the problem of the 
existence of the more general integrals that we are to introduce 
later. 


3. Darboux sums. When discussing the Riemann integral, we 
brought in the so-called Darboux sum. Analogous sums will play a 
basic role in all the generalized integrals to be introduced below. 
We shall construct these sums in the present section and investigate 
their properties for the case of the Stieltjes integral. All the concepts 
introduced in this section, and all the facts proved, will be repeated 
with certain minor modifications in regard to future generalized 
types of integral, and we shall often refer back to the present 
results. 

Let us first of all recall the definition of the strict bounds of the
set of real numbers [I; 39]. Let $E$ be a set of real numbers, and let it
be bounded from above, i.e. there exists a number L such that all
the numbers of the set are less than L. There now exists a definite
number M with the following property: every number of the set $E$
is not greater than M, but, given any positive $\varepsilon$, there are numbers
of $E$ which are greater than $M - \varepsilon$. This number M is called the
strict upper bound of the set $E$. Similarly, if the set is bounded from
below, i.e. if all the numbers of the set are greater than some definite
number, the set has a strict lower bound m, which has the following
property: every number of the set is not less than m, but, given
any positive $\varepsilon$, there are numbers of $E$ which are less than $m + \varepsilon$.
If the set is unbounded from above, its strict upper bound is said
to be $(+\infty)$, and similarly, if it is unbounded from below, its strict
lower bound is said to be $(-\infty)$. The following notation is used for
the strict bounds:

$$m = \inf E \quad \text{and} \quad M = \sup E.$$

Let f(x) and g(x) be functions bounded in the interval [a, b], which
may be finite or infinite, g(x) being a non-decreasing function, and let

$$a = x_0 < x_1 < x_2 < \ldots < x_{n-1} < x_n = b$$

be a subdivision of [a,b] which we write symbolically as $\delta$. In the
case of an interval infinite on the left, $a = -\infty$, and for an interval
infinite on the right, $b = +\infty$. Further let $m_k$ and $M_k$ be the strict
lower and strict upper bounds of f(x) in the sub-interval $[x_{k-1}, x_k]$.
We form the following Stieltjes-Darboux sums, corresponding to the
subdivision $\delta$ of [a, b]:

$$s_\delta = \sum_{k=1}^{n} m_k\,[g(x_k) - g(x_{k-1})]; \qquad S_\delta = \sum_{k=1}^{n} M_k\,[g(x_k) - g(x_{k-1})]. \tag{11}$$

For a bounded function f(x), we have $|f(x)| \le L$, where L is some
positive number. On taking into account that $g(x_k) - g(x_{k-1}) \ge 0$,
given any law of subdivision $\delta$ we have the following inequalities for
sums (11):

$$|s_\delta| \le \sum_{k=1}^{n} L\,[g(x_k) - g(x_{k-1})] = L\,[g(b) - g(a)],$$

$$|S_\delta| \le L\,[g(b) - g(a)].$$

Along with sums (11), we form the following Riemann-Stieltjes
sum:

$$\sigma_\delta = \sum_{k=1}^{n} f(\xi_k)\,[g(x_k) - g(x_{k-1})], \tag{12}$$

where $\xi_k$ is a point of the interval $[x_{k-1}, x_k]$. On observing that
$m_k \le f(\xi_k) \le M_k$ and $g(x_k) - g(x_{k-1}) \ge 0$, we have, for any subdivision $\delta$:

$$s_\delta \le \sigma_\delta \le S_\delta. \tag{13}$$
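
Inequality (13) can be observed on a concrete example. The sketch below makes a simplifying assumption that is not in the text: f is taken to be non-decreasing, so that its strict bounds on each sub-interval are simply its values at the end-points. The functions, the subdivision and the helper names are likewise hypothetical.

```python
def darboux_sums(f, g, xs):
    """Stieltjes-Darboux sums (11). ASSUMPTION: f is non-decreasing, so that
    m_k = f(x_{k-1}) and M_k = f(x_k) on each sub-interval [x_{k-1}, x_k]."""
    lower = sum(f(x0) * (g(x1) - g(x0)) for x0, x1 in zip(xs, xs[1:]))
    upper = sum(f(x1) * (g(x1) - g(x0)) for x0, x1 in zip(xs, xs[1:]))
    return lower, upper

def rs_sum(f, g, xs, tags):
    """Riemann-Stieltjes sum (12)."""
    return sum(f(t) * (g(x1) - g(x0)) for x0, x1, t in zip(xs, xs[1:], tags))

f = lambda x: x ** 3                                # non-decreasing on [0, 2]
g = lambda x: x + (1.0 if x >= 1.0 else 0.0)        # non-decreasing, jump at x = 1

xs = [2.0 * k / 8 for k in range(9)]                # subdivision of [0, 2]
tags = [(x0 + x1) / 2 for x0, x1 in zip(xs, xs[1:])]

s, S = darboux_sums(f, g, xs)
sigma = rs_sum(f, g, xs, tags)
print(s, sigma, S)
```

Whatever intermediate points are chosen, the Riemann-Stieltjes sum lies between the two Darboux sums of the same subdivision.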


Certain new terms must be introduced. The subdivision $\delta'$ is
described as a continuation of subdivision $\delta$ if all the points of
subdivision of $\delta$ are also points of subdivision of $\delta'$. Let $\delta_1$ and $\delta_2$ be
any two subdivisions. We form a new subdivision by taking as the
points of subdivision the points of $\delta_1$ and $\delta_2$. This new subdivision is
called the product of subdivisions $\delta_1$ and $\delta_2$ and is denoted by the
symbol $\delta_1\delta_2$. The subdivision $\delta_1\delta_2$ is obviously a continuation of $\delta_1$
and of $\delta_2$. Obviously, we can also introduce the concept of the product
of any finite number of subdivisions $\delta_1\delta_2 \ldots \delta_m$. It may be further
remarked that the sums $s_\delta$ and $S_\delta$ depend only on the choice of the
subdivision $\delta$, whereas the sum $\sigma_\delta$ depends also on the choice of
points $\xi_k$. We shall now prove some extremely simple theorems.

THEOREM 1. If the subdivision $\delta'$ is a continuation of subdivision $\delta$,
then $s_{\delta'} \ge s_\delta$ and $S_{\delta'} \le S_\delta$.

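Let us prove, say, that $s_{\delta'} \ge s_\delta$. On passing from $\delta$ to $\delta'$, every
sub-interval $[x_{k-1}, x_k]$ of $\delta$ can be split into a finite number of parts:

$$x_{k-1} = x_0^{(k)} < x_1^{(k)} < \ldots < x_{p_k}^{(k)} = x_k,$$

and we obtain, instead of the term $m_k[g(x_k) - g(x_{k-1})]$ of $s_\delta$, the
following sum:

$$\sum_{j=1}^{p_k} m_j^{(k)}\,[g(x_j^{(k)}) - g(x_{j-1}^{(k)})],$$

where $m_j^{(k)}$ is the strict lower bound of f(x) in the sub-interval
$[x_{j-1}^{(k)}, x_j^{(k)}]$. We obviously have $m_j^{(k)} \ge m_k$, so that we have, on observing
that the difference $g(x_j^{(k)}) - g(x_{j-1}^{(k)})$ is non-negative:

$$\sum_{j=1}^{p_k} m_j^{(k)}\,[g(x_j^{(k)}) - g(x_{j-1}^{(k)})] \ge \sum_{j=1}^{p_k} m_k\,[g(x_j^{(k)}) - g(x_{j-1}^{(k)})] = m_k\,[g(x_k) - g(x_{k-1})]$$

and the theorem is proved [cf. I, 112].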

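THEOREM 2. If $\delta_1$ and $\delta_2$ are any two subdivisions,
$s_{\delta_1} \le S_{\delta_2}$.

The inequality $s_\delta \le S_\delta$, for the same subdivision $\delta$,
follows at once from the fact that $m_k \le M_k$ and
$g(x_k) - g(x_{k-1}) \ge 0$. We therefore have
$s_{\delta_1\delta_2} \le S_{\delta_1\delta_2}$ for the subdivision
$\delta_1\delta_2$. On the other hand, by Theorem 1,
$s_{\delta_1} \le s_{\delta_1\delta_2}$ and
$S_{\delta_2} \ge S_{\delta_1\delta_2}$, whence it follows that
$s_{\delta_1} \le S_{\delta_2}$.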
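Theorems 1 and 2 can be checked on a small example. The Python sketch below is an illustration only: the functions f and g and the two subdivisions are assumed, and the helper names `lower_upper` and `product` are hypothetical. The product of two subdivisions is formed as the sorted union of their points.

```python
# Assumed example data: f increasing on [0, 1], g non-decreasing with a jump.
f = lambda x: x * x
g = lambda x: x + (1.0 if x >= 0.5 else 0.0)


def lower_upper(points):
    """Lower and upper Stieltjes-Darboux sums (s, S) for the f and g above."""
    s = S = 0.0
    for left, right in zip(points, points[1:]):
        dg = g(right) - g(left)
        s += f(left) * dg    # f is increasing: strict lower bound at the left end
        S += f(right) * dg   # ... and strict upper bound at the right end
    return s, S


def product(delta_1, delta_2):
    """Product of two subdivisions: all points of both, in increasing order."""
    return sorted(set(delta_1) | set(delta_2))


delta_1 = [0.0, 0.5, 1.0]
delta_2 = [0.0, 0.25, 0.75, 1.0]
delta_12 = product(delta_1, delta_2)

s1, S1 = lower_upper(delta_1)
s2, S2 = lower_upper(delta_2)
s12, S12 = lower_upper(delta_12)

# Theorem 1: the product is a continuation of both subdivisions.
print(s12 >= s1, s12 >= s2, S12 <= S1, S12 <= S2)
# Theorem 2: any lower sum is not greater than any upper sum.
print(s1 <= S2, s2 <= S1)
```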

Let $i$ denote the strict upper bound of sums $s_\delta$ for all possible laws
of subdivision $\delta$ and $I$ the strict lower bound of sums $S_\delta$:

$$i = \sup_\delta s_\delta; \qquad I = \inf_\delta S_\delta. \tag{14}$$

It follows at once from the definition of strict bounds and Theorem 2
that $s_{\delta_1} \le i \le I \le S_{\delta_2}$ for any subdivisions
$\delta_1$ and $\delta_2$, and in particular:

$$s_\delta \le i \le I \le S_\delta. \tag{15}$$


Let us find the necessary and sufficient condition for equality of
the strict bounds $i$ and $I$. The essential role is played here by the
difference

$$S_\delta - s_\delta = \sum_{k=1}^{n} (M_k - m_k)\,[g(x_k) - g(x_{k-1})]. \tag{16}$$


THEOREM 3. The necessary and sufficient condition for $i$ and $I$ to be
equal is that there exist a sequence of subdivisions $\delta_n$
$(n = 1, 2, \ldots)$, such that $S_{\delta_n} - s_{\delta_n} \to 0$.

Sufficiency. If a sequence of subdivisions $\delta_n$ exists for which
$S_{\delta_n} - s_{\delta_n} \to 0$, we obtain $i = I$ by applying inequality
(15) to this sequence.

Necessity. Let $i = I = A$. By the definition of strict bounds,
there exists a sequence of subdivisions $\delta'_n$ such that
$s_{\delta'_n} \to A$, and a sequence of subdivisions $\delta''_n$ such that
$S_{\delta''_n} \to A$. We take the sequence of subdivisions
$\delta_n = \delta'_n \delta''_n$. By Theorem 1,
$s_{\delta_n} \ge s_{\delta'_n}$ and $S_{\delta_n} \le S_{\delta''_n}$, where
$s_{\delta'_n}$ and $s_{\delta_n}$ are not greater than $A$, and
$S_{\delta''_n}$ and $S_{\delta_n}$ are not less than $A$. All the more,
therefore, $s_{\delta_n} \to A$ and $S_{\delta_n} \to A$, so that
$S_{\delta_n} - s_{\delta_n} \to 0$, and the theorem is proved. It is worth
noticing that the sub-intervals in the subdivisions $\delta_n$ need not
necessarily become indefinitely smaller. For instance, it may happen that all
the subdivisions $\delta_n$ consist of the same subdivision $\delta$. The
following corollary is an immediate consequence of (15):

COROLLARY. If $S_{\delta_n} - s_{\delta_n} \to 0$, then $i = I$,
$s_{\delta_n} \to i$ and $S_{\delta_n} \to i$.

The above necessary and sufficient condition for $i = I$ can be stated in
terms of the sums $\sigma_\delta$.

THEOREM 4. The necessary and sufficient condition for the difference
$S_{\delta_n} - s_{\delta_n}$ to tend to zero is that the $\sigma_{\delta_n}$
have a definite limit for any choice of points $\xi_k^{(n)}$, and if this
condition is fulfilled, the limit of $\sigma_{\delta_n}$ is equal to $i$ (or to
$I$, since $i = I$).

Necessity. If $S_{\delta_n} - s_{\delta_n} \to 0$, as we have seen,
$s_{\delta_n} \to i$ and $S_{\delta_n} \to i$, so that we have
$\sigma_{\delta_n} \to i$ for the $\sigma_{\delta_n}$, which satisfy the
inequality $s_{\delta_n} \le \sigma_{\delta_n} \le S_{\delta_n}$. To prove the
sufficiency, let


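$$\sigma_{\delta_n} = \sum_{k=1}^{p_n} f(\xi_k^{(n)})\,[g(x_k^{(n)}) - g(x_{k-1}^{(n)})] \to A,$$

where the $x_k^{(n)}$ are the points of subdivision of $\delta_n$ and the
$\xi_k^{(n)}$ are points of the intervals $[x_{k-1}^{(n)}, x_k^{(n)}]$. Further
we write $m_k^{(n)}$ and $M_k^{(n)}$ for the strict lower and strict upper
bounds of $f(x)$ in the sub-interval $[x_{k-1}^{(n)}, x_k^{(n)}]$. Let
$\varepsilon$ be any given positive number. By virtue of the condition
$\sigma_{\delta_n} \to A$, there exists an $N$ such that

$$|A - \sigma_{\delta_n}| \le \varepsilon \quad \text{for } n > N \tag{17}$$

and any choice of points $\xi_k^{(n)}$. By the definition of strict lower
bound, we can choose points $\xi_k^{(n)}$ such that the following inequalities
are satisfied: $0 \le f(\xi_k^{(n)}) - m_k^{(n)} \le \varepsilon$. We now have:

$$0 \le \sigma_{\delta_n} - s_{\delta_n} = \sum_{k=1}^{p_n} [f(\xi_k^{(n)}) - m_k^{(n)}]\,[g(x_k^{(n)}) - g(x_{k-1}^{(n)})] \le \sum_{k=1}^{p_n} \varepsilon\,[g(x_k^{(n)}) - g(x_{k-1}^{(n)})] = \varepsilon\,[g(b) - g(a)], \tag{18}$$

and consequently, on writing $A - s_{\delta_n}$ as
$A - s_{\delta_n} = (A - \sigma_{\delta_n}) + (\sigma_{\delta_n} - s_{\delta_n})$,
we obtain, by (17) and (18):
$|A - s_{\delta_n}| \le |A - \sigma_{\delta_n}| + |\sigma_{\delta_n} - s_{\delta_n}| \le \varepsilon\,[1 + g(b) - g(a)]$
for $n > N$, whence it follows, since $\varepsilon$ is arbitrary, that
$s_{\delta_n} \to A$. It can be shown similarly that $S_{\delta_n} \to A$, so
that $S_{\delta_n} - s_{\delta_n} \to 0$, and the theorem is proved. The
limit $A$ is obviously the same as the numbers $i$ and $I$, which are equal
in the present case. The following corollary is an immediate consequence
of this and the preceding theorem: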

COROLLARY. The necessary and sufficient condition for $i = I$ is
that a sequence of subdivisions $\delta_n$ exist such that $\sigma_{\delta_n}$
has a definite limit for any choice of points $\xi_k^{(n)}$. If this condition
is satisfied, the limit mentioned is equal to $i$ (or to $I$, since $I = i$).

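THEOREM 5. If, for a sequence of subdivisions $\delta_n$, $\sigma_{\delta_n}$
has a definite limit and $\delta'_n$ is a continuation of $\delta_n$, then
$\sigma_{\delta'_n}$ has the same limit.

It follows from the conditions of the theorem and Theorem 4 that
$S_{\delta_n} - s_{\delta_n} \to 0$. By Theorem 1,
$s_{\delta'_n} \ge s_{\delta_n}$ and $S_{\delta'_n} \le S_{\delta_n}$.
Consequently, all the more $S_{\delta'_n} - s_{\delta'_n} \to 0$, i.e.
$\sigma_{\delta'_n} \to i$, and the theorem is proved.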

In the case of Riemann's integral, i.e. $g(x) = x$, we proved earlier
[I; 112] that $s_\delta \to i$ and $S_\delta \to I$ for any bounded function
$f(x)$ as the sub-intervals become indefinitely smaller. Hence $i = I$ is
equivalent in the case of the Riemann integral to the fact that the sum
$\sigma_\delta$ has a definite limit as the sub-intervals become indefinitely
smaller, this limit being equal to $i$. This is not true in the general case.
If $\sigma_\delta$ has a definite limit as the sub-intervals become
indefinitely smaller, $i = I$ by virtue of the corollary to Theorem 4. But the
converse does not hold. The condition that $i = I$ merely implies that a
sequence of subdivisions $\delta_n$ exists such that $\sigma_{\delta_n}$ has a
definite limit. We cannot assert that $\sigma_\delta$ has a definite limit for
any sequence of subdivisions on indefinite decrease of the sub-intervals. In
the above definition of the Stieltjes integral, we required that
$\sigma_\delta$ have a definite limit on indefinite subdivision of the
interval. In later generalized types of integral we shall replace this
requirement by the weaker requirement that $i = I$. In addition we shall
extend the possibilities as regards subdividing the basic interval of
integration, as will be explained when we give the new definitions. We turn in
the next section to the Stieltjes integral, as defined in [2], and give an
important sufficient condition for its existence.


4. The Stieltjes integral of a continuous function.

THEOREM 1. If f(x) is continuous in the finite interval [a,b], and 
g(x) is a non-decreasing bounded function, the Stieltjes integral of f(x) 
with respect to g(x) over the interval [a, b] exists. 

On taking into account inequalities (13) and (15), we can write

$$|i - \sigma_\delta| \le S_\delta - s_\delta = \sum_{k=1}^{n} (M_k - m_k)\,[g(x_k) - g(x_{k-1})]. \tag{19}$$

Let $\varepsilon$ be a given positive number. By virtue of the uniform
continuity of $f(x)$ in the interval $[a, b]$, there exists a positive number
$\eta$ such that $0 \le M_k - m_k \le \varepsilon$ $(k = 1, 2, \ldots, n)$ if
the greatest of the differences $x_k - x_{k-1}$ does not exceed $\eta$.
Inequality (19) now gives us
$|i - \sigma_\delta| \le \varepsilon\,[g(b) - g(a)]$, so that
$\sigma_\delta \to i$ on indefinite subdivision. It can be shown similarly that
$\sigma_\delta \to I$, so that $i = I$. This equality also follows at once from
the corollary to Theorem 4 of the previous section, by virtue of the fact that
$\sigma_\delta$ has a definite limit as the sub-intervals become indefinitely
smaller.
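The estimate (19) can be observed numerically. In the following Python sketch (an illustration under assumed data, not the book's own example) f is continuous and increasing on [0, 1], g has a jump, and the difference $S_\delta - s_\delta$ is computed for uniform subdivisions into n parts. For this particular f the oscillation $M_k - m_k$ does not exceed 2/n, which plays the role of $\varepsilon$.

```python
# Assumed example data.
f = lambda x: x * x                            # continuous, increasing on [0, 1]
g = lambda x: x + (1.0 if x >= 0.5 else 0.0)   # non-decreasing, jump at 0.5


def darboux_gap(n):
    """S_delta - s_delta, formula (16), for the uniform subdivision into n parts."""
    points = [k / n for k in range(n + 1)]
    gap = 0.0
    for left, right in zip(points, points[1:]):
        oscillation = f(right) - f(left)       # M_k - m_k for an increasing f
        gap += oscillation * (g(right) - g(left))
    return gap


sizes = (10, 100, 1000)
gaps = [darboux_gap(n) for n in sizes]
# Here M_k - m_k <= 2/n, so (19) bounds the gap by (2/n) * [g(1) - g(0)].
bounds = [2.0 / n * (g(1.0) - g(0.0)) for n in sizes]
for n, gap, bound in zip(sizes, gaps, bounds):
    print(n, gap, bound)
```

The gap shrinks with the sub-intervals even though g is discontinuous, which is the content of the theorem: only the continuity of f is used.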

It is not vitally important for the interval of integration in a
Stieltjes integral to be finite. We only need to explain what
is meant by the sub-intervals becoming indefinitely smaller when an
infinite interval is subdivided. Let us take say $[-\infty, +\infty]$. Given a
sequence of subdivisions of this interval into a finite number of
sub-intervals, we shall say that these latter become indefinitely
smaller if, given any positive $A$, the greatest of the differences
$(x_k - x_{k-1})$ tends to zero for the sub-intervals $[x_{k-1}, x_k]$ which
have points in common with $[-A, +A]$. If $\varphi(x)$ is continuous in the
interval $[-\infty, +\infty]$ and is strictly increasing, i.e.
$\varphi(\beta) > \varphi(\alpha)$ for $\beta > \alpha$, the change of variable
$t = \varphi(x)$ transforms the interval $-\infty \le x \le +\infty$ into the
finite interval $[a, b]$, where $a = \varphi(-\infty)$ and
$b = \varphi(+\infty)$. A subdivision of $[-\infty, +\infty]$ with indefinitely
smaller sub-intervals reduces to an ordinary subdivision of the finite interval
$[a, b]$ with indefinitely smaller sub-intervals.

If, for instance, $f(x)$ is continuous in the closed interval
$[-\infty, +\infty]$, whilst $g(x)$ is bounded and non-decreasing, the integral
exists as before. This can be seen e.g. simply by replacing $x$ with the new
variable $t = \arctan x$. On putting

$$f(\tan t) = f_1(t) \quad \text{and} \quad g(\tan t) = g_1(t),$$

we can write the integral over the infinite interval $[-\infty, +\infty]$ as
an integral over the finite interval $[-\pi/2, +\pi/2]$:

$$\int_{-\infty}^{+\infty} f(x)\,dg(x) = \int_{-\pi/2}^{+\pi/2} f_1(t)\,dg_1(t),$$

where $f_1(t)$ is continuous and $g_1(t)$ is bounded and non-decreasing in
$[-\pi/2, +\pi/2]$.
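The change of variable $t = \arctan x$ can be tried out numerically. The Python sketch below is an illustration only: the particular f and g are assumed, and the exact value $\pi/2$ used for comparison is the elementary integral of $dx/(1 + x^2)^2$ over the whole line, which equals the Stieltjes integral for this smooth g.

```python
import math

# Assumed example data.
f = lambda x: 1.0 / (1.0 + x * x)   # continuous on the closed line, f(+-inf) = 0
g = math.atan                       # bounded and non-decreasing

f1 = lambda t: f(math.tan(t))       # f_1(t) = f(tan t)
g1 = lambda t: g(math.tan(t))       # g_1(t) = g(tan t)

n = 2000
a, b = -math.pi / 2, math.pi / 2
# k / n is exactly 1.0 for k = n, so the last point is exactly the float pi/2
# and never overshoots it (tan would change sign beyond the true pi/2).
points = [a + (b - a) * (k / n) for k in range(n + 1)]

sigma = 0.0
for left, right in zip(points, points[1:]):
    xi = (left + right) / 2         # tag strictly inside the sub-interval
    sigma += f1(xi) * (g1(right) - g1(left))

exact = math.pi / 2                 # integral of dx / (1 + x^2)^2 over the line
print(sigma, exact)
```

A uniform subdivision in t corresponds to a subdivision in x whose sub-intervals are small on every finite interval $[-A, +A]$ but unbounded in length near the ends.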

We must mention a practically important modification of the 
fundamental existence theorem for the Stieltjes integral: 

THEOREM 2. If f(x) is continuous and bounded inside the interval of 
integration, and the non-decreasing function g(x) is continuous at the 
ends of the interval, f(x) is integrable with respect to g(x). 

Suppose that the interval of integration is $[-\infty, +\infty]$.

Let us consider the terms on the right-hand side of (19). Since
$f(x)$ is bounded, we have $|f(x)| \le L$, where $L$ is a definite positive
number, so that $0 \le M_k - m_k \le 2L$. The terms of the sum (19) that
correspond to the intervals $[x_{k-1}, x_k]$ having no points in common
with $[-A, A]$ yield a sum not greater than

$$2L\,[g(-A) - g(-\infty)] + 2L\,[g(+\infty) - g(A)]. \tag{20}$$

Since $g(x)$ is assumed continuous at the points $\pm\infty$, we can choose $A$
so large that (20) is less than any given positive $\varepsilon$. We fix $A$
in this way and consider the remaining terms of sum (19). The intervals
$[x_{k-1}, x_k]$ corresponding to them are either wholly contained inside
$[-A, +A]$, or the two extreme sub-intervals fall partly outside $[-A, +A]$,
the length of the parts outside being not greater than $\eta$, where
$\eta$ is the greatest of the differences $x_k - x_{k-1}$ for the sub-intervals
having points in common with $[-A, +A]$. As the sub-intervals
become indefinitely smaller, this number $\eta$ tends to zero, and it will
always be less than unity as from a certain stage in the subdivision.
Hence all the sub-intervals $[x_{k-1}, x_k]$ that we are now considering
will belong, as from a certain stage in the subdivision, to the interval
$[-A - 1, A + 1]$ in which $f(x)$ is uniformly continuous. In view of
this, we have $0 \le M_k - m_k \le \varepsilon$ for all sufficiently small
values of $\eta$, and we now have, for the terms of (19) that correspond to
sub-intervals $[x_{k-1}, x_k]$ having points in common with $[-A, +A]$:

$$0 \le (M_k - m_k)\,[g(x_k) - g(x_{k-1})] \le \varepsilon\,[g(x_k) - g(x_{k-1})],$$

and the sum of these terms will be not greater than

$$\varepsilon\,[g(A + 1) - g(-A - 1)].$$

Finally, inequality (19) gives us

$$|i - \sigma_\delta| \le \varepsilon\,[1 + g(A + 1) - g(-A - 1)] \le \varepsilon\,[1 + g(+\infty) - g(-\infty)],$$

whence it follows, since $\varepsilon$ is arbitrary, that
$\sigma_\delta \to i$, and the theorem is proved.

Some supplementary properties of the Stieltjes integral may be
mentioned, when $f(x)$ is continuous and $g(x)$ increasing. If
$|f(x)| \le L$, we have

$$\left| \int_a^b f(x)\,dg(x) \right| \le L\,[g(b) - g(a)], \tag{21}$$

which is obtained by passing to the limit in the obvious inequality
for the sum $\sigma_\delta$. The mean value theorem obviously holds
[cf. I; 92]:

$$\int_a^b f(x)\,dg(x) = f(\xi)\,[g(b) - g(a)] \qquad (\xi \text{ in } [a, b]). \tag{21$_1$}$$

Now let a sequence of functions $f_n(x)$, continuous in $[a, b]$, tend
uniformly to the limit function $f(x)$ in this interval. The latter function
is also continuous in $[a, b]$, and is therefore integrable with respect
to $g(x)$. Given any positive $\varepsilon$, an $N$ exists, by virtue of the
uniform convergence of the sequence $f_n(x)$, such that
$|f(x) - f_n(x)| \le \varepsilon$ for $x$ in $[a, b]$ provided $n > N$. We
obtain on making use of (21):

$$\left| \int_a^b [f(x) - f_n(x)]\,dg(x) \right| \le \varepsilon\,[g(b) - g(a)],$$

whence, since $\varepsilon$ is arbitrary:

$$\lim_{n \to \infty} \int_a^b f_n(x)\,dg(x) = \int_a^b f(x)\,dg(x). \tag{22}$$


By using the same inequalities as when proving Theorem 2, it can
easily be shown that (22) remains valid with the following assumptions:
the functions $f_n(x)$ are continuous inside $[a, b]$ and are bounded by
the same number, i.e. $|f_n(x)| \le L$, where the positive number $L$ is
the same for all $n$; $f_n(x) \to f(x)$ uniformly in every closed interval
lying inside $[a, b]$, and $g(x)$ is continuous at the ends of $[a, b]$.


5. The improper Stieltjes integral. If $f(x)$ is continuous inside
$[-\infty, +\infty]$ and bounded, whilst $g(x)$ is non-decreasing and
continuous at the ends of the interval, as we have seen, the integral of
$f(x)$ with respect to $g(x)$ over $[-\infty, +\infty]$ can be defined in the
usual way, as the limit of the finite sums $\sigma_\delta$. Now let $f(x)$,
continuous in $[-\infty, +\infty]$, be unbounded, whilst $g(x)$ is
non-decreasing and bounded as before. Given any finite $a$ and $b$, we can form
the integral of $f(x)$ with respect to $g(x)$ over the interval $[a, b]$. If
this integral has a definite finite limit as $a$ tends to $(-\infty)$ and $b$
to $(+\infty)$, this limit is taken as the value of the integral over the
interval $(-\infty, +\infty)$:

$$\int_{-\infty}^{+\infty} f(x)\,dg(x) = \lim_{\substack{a \to -\infty \\ b \to +\infty}} \int_a^b f(x)\,dg(x). \tag{23}$$

If the conditions indicated at the start of this section are fulfilled,
so that the integral over $[-\infty, +\infty]$ exists as the limit of the sum
$\sigma_\delta$, it may easily be shown that (23) holds.
Suppose that the integrals $\int_a^b |f(x)|\,dg(x)$ remain bounded with any
choice of $a$ and $b$. In this case the integral exists:

$$\int_{-\infty}^{+\infty} |f(x)|\,dg(x) = \lim_{\substack{a \to -\infty \\ b \to +\infty}} \int_a^b |f(x)|\,dg(x),$$

and integral (23) obviously also exists [cf. II, 82], being described as
absolutely convergent in this case.
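We take any subdivision of the infinite interval by points $x_k$
$(k = \ldots, -3, -2, -1, 0, 1, 2, 3, \ldots)$:

$$\ldots < x_{-2} < x_{-1} < x_0 < x_1 < x_2 < \ldots \tag{24}$$

$$\left( \lim_{k \to -\infty} x_k = -\infty \quad \text{and} \quad \lim_{k \to +\infty} x_k = +\infty \right).$$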


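Let $m_i$ and $M_i$ be the least and greatest values of $f(x)$ in the interval
$[x_{i-1}, x_i]$ and $\omega_i = M_i - m_i$. We obtain by using (21$_1$) of [4]:

$$\left| \int_{x_{i-1}}^{x_i} f(x)\,dg(x) - f(\xi_i)\,[g(x_i) - g(x_{i-1})] \right| \le \omega_i\,[g(x_i) - g(x_{i-1})]$$

and

$$\left| \int_{x_{-p}}^{x_q} f(x)\,dg(x) - \sum_{i=1-p}^{q} f(\xi_i)\,[g(x_i) - g(x_{i-1})] \right| \le \sum_{i=1-p}^{q} \omega_i\,[g(x_i) - g(x_{i-1})]. \tag{25}$$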


Let the set of numbers $\omega_i$ $(i = 0, \pm 1, \pm 2, \ldots)$ have a finite
strict upper bound $\omega = \sup \omega_i$. By virtue of the continuity of
$f(x)$, we can construct in particular a subdivision (24) of the infinite
interval in which $\omega$ is less than any previously assigned positive
number. We introduce the notation:

$$A = \lim_{x \to -\infty} g(x); \qquad B = \lim_{x \to +\infty} g(x);$$

$$S_{p,q} = \sum_{i=1-p}^{q} f(\xi_i)\,[g(x_i) - g(x_{i-1})];$$

$$S'_{p,q} = \sum_{i=1-p}^{q} |f(\xi_i)|\,[g(x_i) - g(x_{i-1})].$$

Further, let $\omega'_i$ be the value of $\omega_i$ for $|f(x)|$ and
$\omega' = \sup \omega'_i$.


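We obviously have $\omega'_i \le \omega_i$ and $\omega' \le \omega$. It follows
from (25) that

$$\left| \int_{x_{-p}}^{x_q} f(x)\,dg(x) - S_{p,q} \right| \le \omega\,(B - A) \tag{26$_1$}$$

and similarly:

$$\left| \int_{x_{-p}}^{x_q} |f(x)|\,dg(x) - S'_{p,q} \right| \le \omega'\,(B - A), \tag{26$_2$}$$

whence it follows that

$$S'_{p,q} \le \int_{x_{-p}}^{x_q} |f(x)|\,dg(x) + \omega'\,(B - A) \tag{27}$$

and

$$\int_{x_{-p}}^{x_q} |f(x)|\,dg(x) \le S'_{p,q} + \omega'\,(B - A). \tag{28}$$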




We now prove a theorem which gives the necessary and sufficient 
condition for absolute convergence of integral (23). 

THEOREM. The necessary and sufficient condition for absolute
convergence of integral (23) is that there exist a subdivision with finite
$\omega$ and numbers $\xi_i$ for it satisfying $x_{i-1} \le \xi_i \le x_i$,
such that the series

$$\sum_{i=-\infty}^{+\infty} f(\xi_i)\,[g(x_i) - g(x_{i-1})] \tag{29}$$

is absolutely convergent. If this condition is satisfied, series (29) is
convergent for any subdivision (24) with finite $\omega$ and any choice of
$\xi_i$ from the interval $[x_{i-1}, x_i]$, and

$$\int_{-\infty}^{+\infty} f(x)\,dg(x) = \lim_{\omega \to 0} \sum_{i=-\infty}^{+\infty} f(\xi_i)\,[g(x_i) - g(x_{i-1})]. \tag{30}$$


Suppose that integral (23) is absolutely convergent. Inequality (27)
now gives, for any subdivision with finite $\omega$:

$$S'_{p,q} \le \int_{-\infty}^{+\infty} |f(x)|\,dg(x) + \omega'\,(B - A),$$

i.e. the sum $S'_{p,q}$, which increases as $p$ and $q$ increase, remains
bounded, and series (29) is therefore absolutely convergent for any
subdivision (24) with finite $\omega$. Furthermore, (30) follows at once from
(26$_1$). Now suppose conversely that series (29) is absolutely convergent for
some subdivision (24) with finite $\omega$ and for a certain choice of
$\xi_i$. It follows at once from (28) that

$$\int_{x_{-p}}^{x_q} |f(x)|\,dg(x) \le \sum_{i=-\infty}^{+\infty} |f(\xi_i)|\,[g(x_i) - g(x_{i-1})] + \omega'\,(B - A),$$

whence it is clear that the integral on the left remains bounded as
$p$ and $q$ increase, i.e. integral (23) is absolutely convergent. But now,
as we have just seen, series (29) is absolutely convergent for any
subdivision with finite $\omega$ and any choice of $\xi_i$, and (30) holds.

Note. If $f(x)$ is uniformly continuous inside $[-\infty, +\infty]$, and
$\delta$ is the greatest of the differences $(x_i - x_{i-1})$, the condition
$\delta \to 0$ implies $\omega \to 0$, and we can write $\delta \to 0$ instead
of $\omega \to 0$ in (30). This will be the case, for instance, if $f(x) = x$.
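Formula (30) with $f(x) = x$ can be illustrated numerically. In the Python sketch below, which is not from the original text, g is assumed to be a normal distribution function with mean 0.3 (a hypothetical choice), so that the improper integral of x with respect to g is 0.3. The series (29) is truncated to the points $|x_i| \le 12$, outside which the increments of g are negligible for this g, and the tags $\xi_i$ are taken at mid-points.

```python
import math

mu = 0.3  # assumed mean of the distribution described by g

# Assumed g: bounded and non-decreasing (a normal distribution function).
g = lambda x: 0.5 * (1.0 + math.erf((x - mu) / math.sqrt(2.0)))
f = lambda x: x  # unbounded, but uniformly continuous


def truncated_series(step, reach):
    """Partial sum of series (29) over points x_i = i * step with |x_i| <= reach."""
    count = int(round(reach / step))
    total = 0.0
    for i in range(1 - count, count + 1):
        left, right = (i - 1) * step, i * step
        xi = (left + right) / 2                  # tag xi_i in [x_{i-1}, x_i]
        total += f(xi) * (g(right) - g(left))
    return total


steps = (1.0, 0.1, 0.01)
approximations = [truncated_series(step, reach=12.0) for step in steps]
for step, value in zip(steps, approximations):
    print(step, value)  # the values approach mu as the step decreases
```

Here $\omega$ equals the step, so (26$_1$) bounds the error of each approximation by the step itself, since $B - A = 1$.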


6. Jump functions. Let us carry out an elementary analysis of the
properties of a non-decreasing function $g(x)$. Since a monotonic
bounded variable has a limit, the function $g(x)$ will have a limit from
the left and right at every interior point of the interval $[a, b]$:
$g(x - 0)$ and $g(x + 0)$. There will be a limit from the right $g(a + 0)$ at
the left-hand end, and a limit from the left $g(b - 0)$ at the right-hand
end. If $g(x - 0) = g(x + 0)$, $g(x)$ is continuous at the point $x$.

Similarly, continuity at the ends is guaranteed by the equations
$g(a + 0) = g(a)$ and $g(b - 0) = g(b)$. We have $g(x + 0) > g(x - 0)$
at points of discontinuity, and the positive difference
$g(x + 0) - g(x - 0)$ is called the jump of $g(x)$ at the point $x$. Jumps at
the ends are similarly defined.

A function $g(x)$ can have an infinite set of points of discontinuity.
Let us show that, in this case, the set of points of discontinuity of
$g(x)$ must be denumerable. The total increase of $g(x)$ in the interval
$[a, b]$ is given by the positive number $g(b) - g(a)$. The number of
points of discontinuity at which the jump is greater than unity is
therefore not greater than the integral part of the number $g(b) - g(a)$,
i.e. there is a finite number of such points of discontinuity. Similarly,
the number of points of discontinuity at which the jump is greater
than 1/2 is not greater than the integral part of the number
$2\,[g(b) - g(a)]$, and so on. It may now easily be shown that the points of
discontinuity of $g(x)$ can be enumerated. We first enumerate,
in any order, the finite number of points of discontinuity at which
the jump is greater than unity. We proceed by enumerating the remaining
points at which the jump is greater than 1/2, and so on. Since every jump is
positive, each point of discontinuity is reached at some finite stage.

When integrating a continuous function, we need not use for the
subdivision of the interval of integration the points lying inside
$[a, b]$ where $g(x)$ is discontinuous, and the values of $g(x)$ at these
points therefore play no part in the formation of the integral. The
situation is different at the ends of the interval, since they are
necessarily included in the points of subdivision. We can assume say
that $g(x)$ is continuous from the right at the points of discontinuity,
i.e. $g(x) = g(x + 0)$. Let $h(x)$ be the function $g(x)$ thus modified,
i.e. $h(x) = g(x)$ at points where $g(x)$ is continuous and at the right-hand
end, and $h(x) = g(x + 0)$ at points of discontinuity. Only the
change in $g(x)$ at the left-hand end can have an effect on the size of
the integral, and we have the obvious formula:

$$\int_a^b f(x)\,dh(x) = \int_a^b f(x)\,dg(x) - f(a)\,[g(a + 0) - g(a)].$$


We now split $g(x)$ into two terms when it is discontinuous; one term
is a continuous non-decreasing function $g_c(x)$, whilst the second $g_d(x)$
gives the sum of the jumps of $g(x)$ in the interval $[a, x]$. This latter
term is usually called the jump function for $g(x)$. Its precise
construction is as follows.

Let $c_k$ $(k = 1, 2, 3, \ldots)$ be a finite or denumerable set of points
in the interval $[a, b]$. We define increasing functions $\varphi_k(x)$ and
$\psi_k(x)$ as follows:

$$\varphi_k(x) = \begin{cases} 0 & \text{for } x < c_k \\ \alpha_k & \text{for } x \ge c_k \end{cases} \qquad \psi_k(x) = \begin{cases} 0 & \text{for } x \le c_k \\ \beta_k & \text{for } x > c_k, \end{cases}$$

where $\alpha_k$ and $\beta_k$ are non-negative constants such that the series

$$\sum_{k=1}^{\infty} \alpha_k \quad \text{and} \quad \sum_{k=1}^{\infty} \beta_k \tag{31}$$

are convergent. If a constant $\alpha_k$ is zero, the corresponding function
$\varphi_k(x)$ vanishes identically, and the same for $\psi_k(x)$ if
$\beta_k = 0$. We shall include these functions in future formulae for the
sake of symmetry. If $c_k = a$, we shall assume that the corresponding
$\alpha_k$ vanishes, and if $c_k = b$, we assume that the corresponding
$\beta_k$ is zero. It follows at once from the convergence of series (31) that
the series

$$\varphi(x) = \sum_{k=1}^{\infty} \varphi_k(x); \qquad \psi(x) = \sum_{k=1}^{\infty} \psi_k(x), \tag{32}$$

whose terms are non-negative increasing functions, are uniformly
convergent for all $x$ and, in particular, in $[a, b]$. If $x$ differs from
the $c_k$, all the terms of these series are continuous at the point $x$, and
consequently, in view of the uniform convergence, the functions
$\varphi(x)$ and $\psi(x)$ are continuous at all $x$ differing from the $c_k$.
At a point $x = c_k$ the term $\varphi_k(x)$ has a jump from the left equal to
$\alpha_k$, the term $\psi_k(x)$ has a jump from the right equal to $\beta_k$,
and the remaining terms are continuous. In view of the uniform convergence,
the sum of the remaining terms is also continuous at $x = c_k$. At a point
$x = c_k$, therefore, $\varphi(x)$ has a jump from the left equal to
$\alpha_k$ and is continuous from the right, whilst $\psi(x)$ has a jump from
the right equal to $\beta_k$ and is continuous from the left. The whole of
this construction obviously retains its validity in the case when the set of
points $c_k$ is finite.
Now let $g(x)$ be an increasing function and $x = c_k$ its points of
discontinuity, whilst $\alpha_k$ and $\beta_k$ are its jumps from the left and
right at these points, i.e. $\alpha_k = g(c_k) - g(c_k - 0)$ and
$\beta_k = g(c_k + 0) - g(c_k)$. The difference $g(b) - g(a)$ gives the total
increase of $g(x)$ in $[a, b]$, and the sum of its total jumps
$\gamma_k = \alpha_k + \beta_k$ at the first $n$ points
$c_1, c_2, \ldots, c_n$ of discontinuity is not greater than $g(b) - g(a)$ for
any $n$. Hence the infinite series consisting of the total jumps $\gamma_k$ of
$g(x)$ must be convergent. The series consisting of the jumps from the left
$\alpha_k$ and the jumps from the right $\beta_k$ must be all the more
convergent. We form the functions $\varphi(x)$ and $\psi(x)$ and put
$g_d(x) = \varphi(x) + \psi(x)$. The quantity $g_d(x)$ is obviously equal to
the sum of the jumps of $g(x)$ at all points of discontinuity lying to the
left of $x$, and the jump from the left at $x$ itself if it exists, whilst the
difference $g_d(\beta) - g_d(\alpha)$ is equal to the sum of the jumps at the
points of discontinuity lying between $\alpha$ and $\beta$, the jump from the
right at the point $\alpha$ and the jump from the left at the point $\beta$.
The difference $g(\beta) - g(\alpha)$ gives the total increase of $g(x)$ when
$x$ varies from $\alpha$ to $\beta$, whilst the difference
$g_d(\beta) - g_d(\alpha)$ gives the increase of $g(x)$ which is obtained by
taking into account only the jumps at its points of discontinuity. We thus
have the obvious inequality:

$$g(\beta) - g(\alpha) \ge g_d(\beta) - g_d(\alpha) \quad \text{for } \beta > \alpha.$$

Let $g_c(x) = g(x) - g_d(x)$. If $x$ is a point at which $g(x)$ is continuous,
it is a point at which $g_d(x)$ is also continuous, i.e. at which $g_c(x)$ is
continuous. Now let $x$ be equal to one of the $c_k$. At this point $g_d(x)$
has the same jumps as $g(x)$ from the left and right, so that $g_c(x)$ is
continuous at $x = c_k$ also. We can therefore say that $g_c(x)$ is continuous
and increasing. We thus have the required decomposition

$$g(x) = g_d(x) + g_c(x). \tag{33}$$
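The construction can be followed on a function with finitely many jumps. The Python sketch below is an illustration only: the function g, its points of discontinuity and the jump sizes are assumed, and the names `phi_k`, `psi_k`, `g_d` and `g_c` simply mirror the notation of the text.

```python
# Assumed non-decreasing g on [0, 1]: a continuous part x, a jump from the left
# of size 0.5 at c_1 = 0.3 and a jump from the right of size 0.25 at c_2 = 0.7.
def g(x):
    return x + (0.5 if x >= 0.3 else 0.0) + (0.25 if x > 0.7 else 0.0)


points = [0.3, 0.7]   # c_k
alpha = [0.5, 0.0]    # jumps from the left,  g(c_k) - g(c_k - 0)
beta = [0.0, 0.25]    # jumps from the right, g(c_k + 0) - g(c_k)


def phi_k(k, x):
    return alpha[k] if x >= points[k] else 0.0


def psi_k(k, x):
    return beta[k] if x > points[k] else 0.0


def g_d(x):
    """Jump function: the sum of all phi_k(x) and psi_k(x), cf. (32)."""
    return sum(phi_k(k, x) + psi_k(k, x) for k in range(len(points)))


def g_c(x):
    """Continuous part in decomposition (33)."""
    return g(x) - g_d(x)


h = 1e-9
# One-sided increments of g_c at each c_k; both stay of the order of h.
one_sided_gaps = [
    (abs(g_c(c) - g_c(c - h)), abs(g_c(c + h) - g_c(c))) for c in points
]
print(one_sided_gaps)
```

For this g the continuous part is simply $g_c(x) = x$, up to rounding.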


This decomposition can be performed for any interval, closed or
not, finite or infinite. We can write for any continuous function:

$$\int_a^b f(x)\,dg(x) = \int_a^b f(x)\,dg_d(x) + \int_a^b f(x)\,dg_c(x). \tag{34}$$

Let us show that the first of the integrals on the right-hand side
can be written as the sum

$$\int_a^b f(x)\,dg_d(x) = \sum_k f(c_k)\,\gamma_k, \tag{35}$$


where $c_k$ are points at which $g(x)$ is discontinuous and $\gamma_k$ are the
total jumps of $g(x)$ at these points. We shall assume that the number
of points of discontinuity is infinite. On putting
$\omega_k(x) = \varphi_k(x) + \psi_k(x)$, we can write

$$g_d(x) = s_m(x) + \tau_m(x),$$

where

$$s_m(x) = \sum_{k=1}^{m} \omega_k(x); \qquad \tau_m(x) = \sum_{k=m+1}^{\infty} \omega_k(x).$$

We have the inequality
$0 \le \tau_m(x) \le \gamma_{m+1} + \gamma_{m+2} + \ldots$, and, in
view of the convergence of the series composed of the $\gamma_k$, given any
positive $\varepsilon$ we can fix an $N$ such that, for any $x$,

$$0 \le \tau_m(x) \le \varepsilon \quad \text{for } m > N. \tag{36}$$


Further, since $f(x)$ is continuous, we have [2]:

$$\int_a^b f(x)\,d\omega_k(x) = f(c_k)\,\gamma_k,$$

so that

$$\int_a^b f(x)\,ds_m(x) = \sum_{k=1}^{m} f(c_k)\,\gamma_k. \tag{37}$$

The function $f(x)$ is bounded, i.e. $|f(x)| \le L$, and for the terms of
the last sum we have the inequality $|f(c_k)\,\gamma_k| \le L\gamma_k$, whence
it is clear that the series composed of the numbers $f(c_k)\,\gamma_k$ is
absolutely convergent.

By (36), we have for the integral with respect to a non-decreasing
function $\tau_m(x)$:

$$\left| \int_a^b f(x)\,d\tau_m(x) \right| \le L\varepsilon \qquad (m > N),$$

whence, since $\varepsilon$ is arbitrary, it follows that the difference

$$\int_a^b f(x)\,dg_d(x) - \int_a^b f(x)\,ds_m(x) = \int_a^b f(x)\,d\tau_m(x)$$

tends to zero as $m$ increases, i.e.

$$\int_a^b f(x)\,dg_d(x) = \lim_{m \to \infty} \int_a^b f(x)\,ds_m(x),$$

whence, by (37), we have the formula

$$\int_a^b f(x)\,dg_d(x) = \sum_{k=1}^{\infty} f(c_k)\,\gamma_k. \tag{38}$$
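Formulae (34) and (38) can be compared with Riemann-Stieltjes sums on a small example. The following Python sketch is an illustration under assumed data: f, the two jump points and their sizes are hypothetical, the continuous part is taken as $g_c(x) = x$, and the helper name `stieltjes_sum` is not from the text.

```python
# Assumed example data.
f = lambda x: x * x
jumps = [(0.3, 0.5), (0.7, 0.25)]   # hypothetical pairs (c_k, gamma_k)


def g_d(x):
    """Jump function; taken here to be continuous from the right."""
    return sum(gamma for c, gamma in jumps if x >= c)


def g(x):
    return x + g_d(x)               # continuous part g_c(x) = x


def stieltjes_sum(f, g, n):
    """Riemann-Stieltjes sum over [0, 1], uniform subdivision, mid-point tags."""
    total = 0.0
    for k in range(1, n + 1):
        left, right = (k - 1) / n, k / n
        total += f((left + right) / 2) * (g(right) - g(left))
    return total


n = 10000
jump_part = sum(f(c) * gamma for c, gamma in jumps)   # right-hand side of (38)
sigma_d = stieltjes_sum(f, g_d, n)   # approximates the integral with respect to g_d
sigma = stieltjes_sum(f, g, n)       # approximates the full integral in (34)
print(jump_part, sigma_d, sigma)
```

The sum for g should be close to the jump part plus 1/3, the ordinary integral of $x^2$ over [0, 1] contributed by the continuous part.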


7. Physical interpretation. A physical interpretation may be given
of the function $g(x)$ and the Stieltjes integral. Let matter be distributed
over the interval $[a, b]$, and let $g(x)$ be the mass contained in the
interval $[a, x]$, and $g(a)$ the mass at the point $x = a$, if such a
concentrated mass is present. Otherwise, we put $g(a) = 0$. The difference
$g(d) - g(c)$ gives the mass contained in the interval $(c, d]$. When the
positive number $h$ tends to zero the interval $(x, x + h]$ is compressed,
and any point goes outside $(x, x + h]$ for sufficiently small $h$, since
the left-hand end is not included in the interval. The function $g(x)$
is increasing (mass is positive), and, by what has been said above,
it is natural to subject the function $g(x)$, characterizing the mass
distribution, to the condition $g(x + h) - g(x) \to 0$ or $g(x) = g(x + 0)$,
i.e. $g(x)$ must be continuous from the right at all the points of
discontinuity excepting $x = b$. There is no sense in talking of the continuity
from the right at the right-hand end of the interval, since the function is
not defined for $x > b$. Inside the interval there are concentrated masses at
the points where $g(x)$ is discontinuous, and the size of the concentrated
mass is given by the difference $g(x) - g(x - 0)$. The same applies
for the right-hand end of the interval. The total amount of matter
in the interval $[a, b]$ is equal to $g(b)$. Everything that has been said
is suitable either for a finite or an infinite interval. A characteristic
feature of the above arguments is that we have made no use of the
concept of density of the distribution. The centre of gravity of the
distributed matter will be given by

$$x_c = \frac{1}{g(b)} \int_a^b x\,dg(x).$$

This formula is suitable for a finite interval. In the case of an
infinite interval, the integrated function $f(x) = x$ ceases to be bounded,
and we have to use the definition of improper integral.
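The centre-of-gravity formula needs no density, so it applies equally to a distribution with concentrated masses. The Python sketch below is an illustration with assumed data: unit density on [0, 1] together with a unit point mass at the right-hand end, for which the centre of gravity is (1/2 + 1)/2 = 0.75. The helper name `centre_of_gravity` is hypothetical.

```python
def g(x):
    """Assumed mass in [0, x]: unit density on [0, 1] plus a unit mass at x = 1."""
    return x + (1.0 if x >= 1.0 else 0.0)


def centre_of_gravity(g, a, b, n):
    """Approximate (1 / g(b)) * integral of x dg(x) by a Riemann-Stieltjes sum."""
    first_moment = 0.0
    for k in range(1, n + 1):
        left = a + (b - a) * ((k - 1) / n)
        right = a + (b - a) * (k / n)
        # mid-point tag times the mass contained in (left, right]
        first_moment += (left + right) / 2 * (g(right) - g(left))
    return first_moment / g(b)


x_c = centre_of_gravity(g, 0.0, 1.0, 1000)
print(x_c)  # close to 0.75
```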

In the theory of probability, the function $g(x)$ usually expresses
the probability distribution of some random magnitude, viz. $g(x)$
is equal to the probability of the random magnitude belonging to
the interval $(-\infty, x]$. Here, as above, $g(x)$ is continuous from the
right. The concept of the Stieltjes integral of a continuous function
can be extended readily, as we shall see, to the case when $g(x)$ is
the difference between two non-decreasing functions:
$g(x) = g_1(x) - g_2(x)$. A physical interpretation of $g(x)$ is easily given
in this case. Suppose that positive and negative charges are distributed in
the interval $(-\infty, +\infty)$. Now, $g_1(x)$ defines the total positive
charge in the interval $(-\infty, x]$, and $g_2(x)$ the total negative charge
in this interval.




8. Functions of bounded variation. We have so far assumed that
the integrating function $g(x)$ is increasing. In order to pass to integrals
with more general functions $g(x)$, we must introduce a class of functions
which is in fact the fundamental class to which all our integrating
functions $g(x)$ will have to belong. Let $g(x)$ be a given function in the
finite or infinite closed interval $[a, b]$ which takes a finite value at
every point of the interval. Let $\delta$ be a subdivision of $[a, b]$:
$a = x_0 < x_1 < \ldots < x_{n-1} < x_n = b$. We form the sum:

$$t_\delta = \sum_{k=1}^{n} |g(x_k) - g(x_{k-1})|. \tag{39}$$


DEFINITION. If the set of values of this sum is bounded for all possible
subdivisions $\delta$, the function $g(x)$ is said to be of bounded variation
in the interval $[a, b]$, whilst the strict upper bound of sums (39) is called
the total variation or simply the variation of $g(x)$ in $[a, b]$.

We shall write it symbolically as $V_a^b(g)$. Some simple properties of
the sums $t_\delta$ and of the total variation must be mentioned. If we
introduce a new point of subdivision $c$ between the points $x_{k-1}$ and
$x_k$, it follows at once from the formula

$$g(x_k) - g(x_{k-1}) = [g(x_k) - g(c)] + [g(c) - g(x_{k-1})]$$

that

$$|g(x_k) - g(x_{k-1})| \le |g(x_k) - g(c)| + |g(c) - g(x_{k-1})|,$$

i.e. the sum $t_\delta$ does not decrease on the addition of new points of
subdivision. Further, if the sums $t_\delta$, consisting of non-negative terms,
remain bounded for the interval $[a, b]$, they will all the more be
bounded for any interval $[\alpha, \beta]$ making up part of $[a, b]$, i.e. if
$g(x)$ is of bounded variation in $[a, b]$, it will be of bounded variation in
a part $[\alpha, \beta]$ of $[a, b]$ and $V_\alpha^\beta(g) \le V_a^b(g)$.

If we take the interval $[a, b]$ in its entirety, this is one of the
possible subdivisions $\delta$, and since we obviously have
$t_\delta \le V_a^b(g)$ for any subdivision, we must have in particular:

$$|g(b) - g(a)| \le V_a^b(g). \tag{40}$$


If $g(x)$ is a monotonic function in $[a, b]$, all the differences
$g(x_k) - g(x_{k-1})$ have the same sign, and the sum $t_\delta$ is equal to
$g(b) - g(a)$ for any $\delta$ in the case of an increasing function, and equal
to $g(a) - g(b)$ for a decreasing function, i.e. any monotonic function is a
function of bounded variation.
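The behaviour of the sums (39) can be observed on simple examples. The Python sketch below is an illustration with assumed functions: sin x on $[0, 2\pi]$, whose total variation is 4, and the monotonic function exp x on [0, 1]. The helper names `variation_sum` and `uniform` are hypothetical.

```python
import math


def variation_sum(g, points):
    """t_delta of formula (39) for the subdivision given by `points`."""
    return sum(abs(g(r) - g(l)) for l, r in zip(points, points[1:]))


def uniform(a, b, n):
    """Uniform subdivision of [a, b] into n sub-intervals."""
    return [a + (b - a) * (k / n) for k in range(n + 1)]


a, b = 0.0, 2.0 * math.pi
coarse = uniform(a, b, 3)
fine = uniform(a, b, 12)   # a continuation of `coarse`; contains pi/2 and 3*pi/2
t_coarse = variation_sum(math.sin, coarse)
t_fine = variation_sum(math.sin, fine)
print(t_coarse, t_fine)    # the sum does not decrease on adding points

# A monotonic example: t_delta equals g(b) - g(a) whatever the subdivision.
t_monotone = variation_sum(math.exp, uniform(0.0, 1.0, 7))
print(t_monotone, math.e - 1.0)
```

Since the finer subdivision contains the points at which sin x attains its extreme values, its sum already reaches the total variation up to rounding.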

We shall now state as separate theorems a number of properties of 
functions of bounded variation. 


THEOREM 1. If $g(x)$ is of bounded variation in $[a, b]$, it is bounded
in this interval.

Given any $x$ of $[a, b]$, we can write $g(x) = g(a) + [g(x) - g(a)]$, i.e.
$|g(x)| \le |g(a)| + |g(x) - g(a)|$, or, by (40),
$|g(x)| \le |g(a)| + V_a^x(g) \le |g(a)| + V_a^b(g)$, which proves that $g(x)$
is bounded.

THEOREM 2. If $g(x)$ and $h(x)$ are of bounded variation in $[a, b]$, then
$cg(x)$ ($c$ is a constant) and $g(x) + h(x)$ are also of bounded variation.

We shall give the proof for the sum. We form $t_\delta$ for $g(x) + h(x)$:

$$t_\delta = \sum_{k=1}^{n} \bigl| [g(x_k) + h(x_k)] - [g(x_{k-1}) + h(x_{k-1})] \bigr| \le \sum_{k=1}^{n} |g(x_k) - g(x_{k-1})| + \sum_{k=1}^{n} |h(x_k) - h(x_{k-1})|.$$

The last two sums are bounded, since $g(x)$ and $h(x)$ are of bounded
variation by hypothesis. Hence $t_\delta$ is all the more bounded, i.e.
$g(x) + h(x)$ is of bounded variation.

COROLLARY. Every finite linear combination of functions of bounded
variation, i.e. every expression of the form
$c_1 f_1(x) + c_2 f_2(x) + \ldots + c_p f_p(x)$, is also a function of bounded
variation.

THEOREM 3. If $g(x)$ and $h(x)$ are of bounded variation, their product
$g(x)h(x)$ is also of bounded variation. If, moreover,
$|h(x)| \ge m > 0$, the quotient $g(x)/h(x)$ is of bounded variation.

We consider the product, for which we form $t_\delta$:

$$t_\delta = \sum_{k=1}^{n} |g(x_k)\,h(x_k) - g(x_{k-1})\,h(x_{k-1})|. \tag{41}$$

Since $g(x)$ and $h(x)$ are bounded, we can write $|g(x)| \le L$ and
$|h(x)| \le L$, where $L$ is a positive number.
We have the obvious equation:

$$g(x_k)\,h(x_k) - g(x_{k-1})\,h(x_{k-1}) = g(x_k)\,[h(x_k) - h(x_{k-1})] + h(x_{k-1})\,[g(x_k) - g(x_{k-1})],$$

which, in conjunction with (41), gives us

$$t_\delta \le \sum_{k=1}^{n} |g(x_k)|\,|h(x_k) - h(x_{k-1})| + \sum_{k=1}^{n} |h(x_{k-1})|\,|g(x_k) - g(x_{k-1})|$$

or

$$t_\delta \le L \sum_{k=1}^{n} |h(x_k) - h(x_{k-1})| + L \sum_{k=1}^{n} |g(x_k) - g(x_{k-1})|.$$

But the sums written are bounded, since $g(x)$ and $h(x)$ are of bounded
variation by hypothesis, so that the sums $t_\delta$ are also bounded, which
proves the theorem.

THEOREM 4. If a < c < b and g(x) is of bounded variation in [a, b], 
it is of bounded variation in [a, c] and [c, b] and conversely, if it is of 
bounded variation in [a, c] and [c, b], it is of bounded variation in [a, b]. 
We have the formula here: 


V5g) = Velg) + Vig). (42) 


We saw above that, if g(x) is of bounded variation in [a, b], it is of
bounded variation in [a, c] and [c, b]. It remains to prove the converse
and (42). We write $t_\delta$ for the sum (39) for [a, b], and $t^{(1)}_{\delta_1}$ and $t^{(2)}_{\delta_2}$
for the corresponding sums for [a, c] and [c, b]. If c is a point of the
subdivision $\delta$, $\delta$ splits into a subdivision $\delta_1$ of the interval [a, c] and a
subdivision $\delta_2$ of [c, b], and we have $t_\delta = t^{(1)}_{\delta_1} + t^{(2)}_{\delta_2}$. If g(x) is of
bounded variation in [a, c] and [c, b], the previous formula gives
$t_\delta \le V_a^c(g) + V_c^b(g)$. The sums $t_\delta$ therefore remain bounded if c is a
point of subdivision. They are all the more bounded for other subdivisions,
since the addition of a point of subdivision can only increase
$t_\delta$. It follows from this argument that g(x) is of bounded variation in
[a, b] and that $V_a^b(g) \le V_a^c(g) + V_c^b(g)$. We shall now prove the
reverse inequality, whence (42) will follow. Let $\varepsilon$ be a given positive
number. By the definition of strict upper bound, we can choose subdivisions
$\delta_1$ and $\delta_2$ in the formula $t_\delta = t^{(1)}_{\delta_1} + t^{(2)}_{\delta_2}$ such that
$t^{(1)}_{\delta_1} > V_a^c(g) - \varepsilon$ and $t^{(2)}_{\delta_2} > V_c^b(g) - \varepsilon$. We now obtain:
$t_\delta > V_a^c(g) + V_c^b(g) - 2\varepsilon$, whence $V_a^b(g) > V_a^c(g) + V_c^b(g) - 2\varepsilon$, or, since $\varepsilon$ is
arbitrary, $V_a^b(g) \ge V_a^c(g) + V_c^b(g)$, which finally proves the theorem.
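Formula (42) can be illustrated numerically for a piecewise monotone function. The sketch below is an illustration only: it takes g(x) = sin x on [0, 2π] with c = π, and uses a uniform subdivision that contains c and the turning points π/2 and 3π/2, so that the sums equal the total variations up to rounding. The helper name `variation_sum` is hypothetical.

```python
import math

def variation_sum(func, points):
    """Sum of |func(x_k) - func(x_{k-1})| over the subdivision `points`."""
    return sum(abs(func(q) - func(p)) for p, q in zip(points, points[1:]))

n = 4000                                  # divisible by 4, so pi/2, pi, 3*pi/2 are nodes
points = [2.0 * math.pi * k / n for k in range(n + 1)]
mid = n // 2                              # index of the point c = pi

v_ac = variation_sum(math.sin, points[: mid + 1])   # sum over [0, pi]
v_cb = variation_sum(math.sin, points[mid:])         # sum over [pi, 2*pi]
v_ab = variation_sum(math.sin, points)               # sum over [0, 2*pi]

print(v_ac, v_cb, v_ab)                   # approximately 2, 2 and 4
```

For a function that is not piecewise monotone, such sums give only lower bounds for the total variations, and the equality has to be understood in the sense of strict upper bounds, as in the proof.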

COROLLARY. We have proved the theorem for the subdivision of the
interval [a, b] into two parts. By applying it several times we can obtain
a similar result for the subdivision of [a, b] into a finite number of
sub-intervals, i.e. if [a, b] is split into a finite number of sub-intervals
and g(x) is of bounded variation throughout the interval, it will be of
bounded variation in each sub-interval, and conversely; furthermore,
the total variation over the whole interval is equal to the sum of the
total variations in each sub-interval. This property is usually described
as the property of additiveness of the total variation. We can
write it in the form

$$V_a^b(g) = V_a^{c_1}(g) + V_{c_1}^{c_2}(g) + \ldots + V_{c_{m-1}}^{b}(g), \tag{43}$$

where $a < c_1 < c_2 < \ldots < c_{m-1} < b$ are the points of division.




THEOREM 5. The necessary and sufficient condition for g(x) to be of 
bounded variation is that it is expressible as the difference between two 
increasing functions. 

The sufficiency is obvious. Increasing functions are functions of
bounded variation, and by the corollary to Theorem 2, the difference
between two such functions is also of bounded variation. Let us prove
the necessity, i.e. if g(x) is of bounded variation, it is expressible as the
difference between two increasing functions. If we put

$$g_1(x) = \tfrac{1}{2}\,[V_a^x(g) + g(x)]; \qquad g_2(x) = \tfrac{1}{2}\,[V_a^x(g) - g(x)], \tag{44}$$

we have

$$g(x) = g_1(x) - g_2(x), \tag{45}$$

and it is sufficient to show that the functions $g_1(x)$ and $g_2(x)$ are
increasing. We shall prove this for $g_1(x)$. Let $\alpha$ and $\beta$ belong to [a, b]
and $\alpha < \beta$. We have

$$g_1(\beta) - g_1(\alpha) = \tfrac{1}{2}\,[V_a^\beta(g) - V_a^\alpha(g) + g(\beta) - g(\alpha)],$$

or, in view of the additiveness of the total variation:

$$g_1(\beta) - g_1(\alpha) = \tfrac{1}{2}\,[V_\alpha^\beta(g) + g(\beta) - g(\alpha)].$$

But, by (40), $V_\alpha^\beta(g) \ge |g(\beta) - g(\alpha)|$, whence it follows that
$g_1(\beta) - g_1(\alpha) \ge 0$.
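The construction (44) has a direct discrete analogue, which may help to see why both parts are increasing. The sketch below is an illustration under stated assumptions: the function is replaced by its samples on a grid, and the running variation sum over the grid stands in for $V_a^x(g)$; the names `jordan_decomposition` and `is_nondecreasing` are hypothetical.

```python
import math

def jordan_decomposition(values):
    """Discrete analogue of (44) for samples g(x_0), ..., g(x_n).

    v[i] is the variation sum over the first i sub-intervals of the grid;
    it plays the role of the total variation from a to x_i.
    """
    v = [0.0]
    for prev, cur in zip(values, values[1:]):
        v.append(v[-1] + abs(cur - prev))
    g1 = [0.5 * (vi + gi) for vi, gi in zip(v, values)]
    g2 = [0.5 * (vi - gi) for vi, gi in zip(v, values)]
    return g1, g2

def is_nondecreasing(seq, tol=1e-12):
    """True if no term is smaller than its predecessor (up to rounding)."""
    return all(b >= a - tol for a, b in zip(seq, seq[1:]))

xs = [2.0 * math.pi * k / 200 for k in range(201)]
samples = [math.sin(x) for x in xs]       # g(x) = sin x, an illustrative choice
g1, g2 = jordan_decomposition(samples)

print(is_nondecreasing(g1), is_nondecreasing(g2))
```

Each increment of the discrete $g_1$ is half of |d| + d, and each increment of $g_2$ is half of |d| - d, where d is the increment of g; both are non-negative, which is the same observation as in the proof.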

The increasing functions $g_1(x)$ and $g_2(x)$ can only have a finite or
denumerable set of points of discontinuity, and they have a limit from
the left and right at every such point. The same can therefore be said
of the function g(x).

THEOREM 6. If g(x) is continuous at a point x = c, the function
$V_a^x(g) = v(x)$ is also continuous at this point, and conversely. If g(x) is
continuous from the right (left), v(x) is also continuous from the right
(left), and conversely.

Suppose that c < b. Let us consider, say, continuity from the right.
Given a positive $\varepsilon$, we can subdivide [c, b] in such a way,
$c = x_0 < x_1 < \ldots < x_n = b$, that

$$\sum_{k=1}^{n} |g(x_k) - g(x_{k-1})| > V_c^b(g) - \varepsilon. \tag{46}$$

If we add new points of subdivision, this inequality will be all the
more satisfied. We can therefore assume that the point $x_1$ is taken so
close to c that $|g(x_1) - g(c)| < \varepsilon$. Use is made here of the continuity
of g(x) from the right. Inequality (46) can be rewritten as

$$|g(x_1) - g(c)| + \sum_{k=2}^{n} |g(x_k) - g(x_{k-1})| > V_c^b(g) - \varepsilon,$$

whence we obtain, since $|g(x_1) - g(c)| < \varepsilon$:

$$\sum_{k=2}^{n} |g(x_k) - g(x_{k-1})| > V_c^b(g) - 2\varepsilon.$$

The sum on the left is a sum $t_\delta$ for the interval $[x_1, b]$, and it follows
from the last inequality that

$$V_{x_1}^b(g) > V_c^b(g) - 2\varepsilon,$$

or, since the total variation is additive, we have $V_c^{x_1}(g) < 2\varepsilon$, i.e.
$v(x_1) - v(c) < 2\varepsilon$. The function v(x) is increasing, and it follows from
the last inequality that $v(c + 0) - v(c) < 2\varepsilon$, whence, since $\varepsilon$ is arbitrary,
we have v(c + 0) = v(c), i.e. $v(x) = V_a^x(g)$ is continuous from the right
at the point x = c. Conversely, if we are given that v(x) is continuous
from the right, we have by (40): $|g(c + h) - g(c)| \le v(c + h) - v(c)$,
and as the positive number h tends to zero the right-hand side
tends to zero, so that the left all the more tends to zero, which proves
that g(x) is continuous from the right at x = c.

If g(x) is continuous at the point c, by what has been proved, the
functions $g_1(x)$ and $g_2(x)$ defined by (44) are also continuous at x = c.
This statement obviously holds for continuity from the right or left.

THEOREM 7. If g(x) is of bounded variation, and

$$g(x) = g_1^*(x) - g_2^*(x) \tag{47}$$

is any representation of g(x) as the difference between two increasing functions,
we have the following inequalities for any $\alpha < \beta$ belonging to [a, b]:

$$g_1(\beta) - g_1(\alpha) \le g_1^*(\beta) - g_1^*(\alpha); \qquad g_2(\beta) - g_2(\alpha) \le g_2^*(\beta) - g_2^*(\alpha). \tag{48}$$

We confine ourselves to the proof of the first inequality, which
can be written as

$$\tfrac{1}{2}\,[V_\alpha^\beta(g) + g(\beta) - g(\alpha)] \le g_1^*(\beta) - g_1^*(\alpha). \tag{49}$$

We use reductio ad absurdum. Suppose that the reverse inequality
holds:

$$\tfrac{1}{2}\,[V_\alpha^\beta(g) + g(\beta) - g(\alpha)] > g_1^*(\beta) - g_1^*(\alpha). \tag{50}$$




We choose a subdivision $\delta$ of the interval $[\alpha, \beta]$ such that the sum $t_\delta$
is so close to the variation $V_\alpha^\beta(g)$ that inequality (50) still remains in
force when this total variation is replaced by the sum in question.
Thus we have, for some subdivision $\alpha = x_0 < x_1 < \ldots < x_{n-1} < x_n = \beta$:

$$\tfrac{1}{2}\Bigl[\sum_{k=1}^{n} |g(x_k) - g(x_{k-1})| + g(\beta) - g(\alpha)\Bigr] > g_1^*(\beta) - g_1^*(\alpha). \tag{51}$$

On the other hand, we can evidently write

$$g(\beta) - g(\alpha) = \sum_{k=1}^{n} [g(x_k) - g(x_{k-1})]; \qquad g_1^*(\beta) - g_1^*(\alpha) = \sum_{k=1}^{n} [g_1^*(x_k) - g_1^*(x_{k-1})],$$

so that inequality (51) can be written as

$$\tfrac{1}{2} \sum_{k=1}^{n} \bigl[\,|g(x_k) - g(x_{k-1})| + g(x_k) - g(x_{k-1})\bigr] > \sum_{k=1}^{n} [g_1^*(x_k) - g_1^*(x_{k-1})].$$

At least one of the terms on the left-hand side must be greater than
the corresponding term on the right. Let this be with k = p. This
leads us to the inequality:

$$\tfrac{1}{2}\bigl[\,|g(x_p) - g(x_{p-1})| + g(x_p) - g(x_{p-1})\bigr] > g_1^*(x_p) - g_1^*(x_{p-1}). \tag{52}$$

If $g(x_p) - g(x_{p-1}) \le 0$, this inequality is absurd, since its left-hand
side is zero, whilst the right-hand side is non-negative because $g_1^*(x)$ is
increasing by hypothesis. It remains to suppose that $g(x_p) - g(x_{p-1}) > 0$.
In this case (52) can be written as

$$g(x_p) - g(x_{p-1}) > g_1^*(x_p) - g_1^*(x_{p-1}),$$

or, by (47), it reduces to

$$-[g_2^*(x_p) - g_2^*(x_{p-1})] > 0,$$

which is absurd, since $g_2^*(x)$ is increasing by hypothesis. We have thus
arrived at an absurdity, and inequality (49), and hence the whole of
the theorem, is proved.




Expression (45) for g(x) as the difference between the two functions
$g_1(x)$ and $g_2(x)$ given by (44) is usually called the canonical form for a
function of bounded variation as the difference between two increasing
functions. By the above theorem, the $g_1(x)$ and $g_2(x)$ appearing in the
canonical form are not more rapidly increasing than the functions
appearing in any other form. If we add the same constant c to $g_1(x)$
and $g_2(x)$, this obviously has no effect on their difference, nor on their
increment in any part $[\alpha, \beta]$ of the interval [a, b], and the form obtained
for g(x) as the difference between $g_1(x) + c$ and $g_2(x) + c$ can
also be described as canonical.
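A small worked example may make the canonical form concrete. It is not taken from the original text: the function g(x) = |x| on [-1, 1] and the non-canonical pair $g_1^*$, $g_2^*$ are chosen purely for illustration.

```latex
\begin{aligned}
& g(x) = |x| \ \text{on } [-1, 1]: \qquad V_{-1}^{x}(g) = 1 + x , \\
& g_1(x) = \tfrac12\,[\,1 + x + |x|\,], \qquad
  g_2(x) = \tfrac12\,[\,1 + x - |x|\,], \qquad
  g_1(x) - g_2(x) = |x| , \\
& g_1^*(x) = g_1(x) + x, \quad g_2^*(x) = g_2(x) + x: \qquad
  g_1^*(\beta) - g_1^*(\alpha) = g_1(\beta) - g_1(\alpha) + (\beta - \alpha)
  \ \ge\ g_1(\beta) - g_1(\alpha) .
\end{aligned}
```

Here $g_1(x)$ is constant on [-1, 0] and increases on [0, 1], whilst $g_2(x)$ increases on [-1, 0] and is constant on [0, 1]; the pair $g_1^*$, $g_2^*$ also represents g(x) but has larger increments, in agreement with Theorem 7.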

Note. Each of the increasing functions $g_1(x)$ and $g_2(x)$ can be
split into a jump function and a continuous part:

$$g_1(x) = g_{1d}(x) + g_{1c}(x); \qquad g_2(x) = g_{2d}(x) + g_{2c}(x).$$

This leads us to a completely determined decomposition of g(x) into a
jump function and a continuous part:

$$g(x) = [g_{1d}(x) - g_{2d}(x)] + [g_{1c}(x) - g_{2c}(x)]. \tag{53}$$


9. An integrating function of bounded variation. If f(x) is a con- 
tinuous function in [a, b] and g(x) is of bounded variation, by using 
the expression for g(x) as the difference between two increasing func- 
tions we can write 


$$\sum_{k=1}^{n} f(\xi_k)[g(x_k) - g(x_{k-1})] = \sum_{k=1}^{n} f(\xi_k)[g_1(x_k) - g_1(x_{k-1})] - \sum_{k=1}^{n} f(\xi_k)[g_2(x_k) - g_2(x_{k-1})]. \tag{54}$$

The sums on the right have a definite limit as the subdivision becomes
indefinitely finer, so that the same can be said of the sums on
the left, i.e. a continuous function is integrable with respect to a
function of bounded variation. Passage to the limit in (54) gives

$$\int_a^b f(x)\,dg(x) = \int_a^b f(x)\,dg_1(x) - \int_a^b f(x)\,dg_2(x). \tag{55}$$

Let us indicate the changes that have to be introduced into the
statement of the properties of the Stieltjes integral if g(x) is a function
of bounded variation.




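We have

$$\Bigl|\sum_{k=1}^{n} f(\xi_k)[g(x_k) - g(x_{k-1})]\Bigr| \le L \sum_{k=1}^{n} |g(x_k) - g(x_{k-1})| \le L\,V_a^b(g),$$

if $|f(x)| \le L$. Passage to the limit gives

$$\Bigl|\int_a^b f(x)\,dg(x)\Bigr| \le L\,V_a^b(g). \tag{56}$$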
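The estimate (56) can be checked on a concrete Stieltjes sum. The following is a minimal sketch rather than anything from the original text: it takes f(x) = cos x and g(x) = sin x on [0, 2π], chooses the midpoints of the sub-intervals as the points $\xi_k$, and uses the hypothetical helper names `stieltjes_sum` and `variation_sum`.

```python
import math

def stieltjes_sum(f, g, points):
    """Sum of f(xi_k) [g(x_k) - g(x_{k-1})], with xi_k the midpoint of each sub-interval."""
    return sum(f(0.5 * (p + q)) * (g(q) - g(p)) for p, q in zip(points, points[1:]))

def variation_sum(g, points):
    """Sum of |g(x_k) - g(x_{k-1})| over the subdivision `points`."""
    return sum(abs(g(q) - g(p)) for p, q in zip(points, points[1:]))

n = 4000
points = [2.0 * math.pi * k / n for k in range(n + 1)]

# Approximates the integral of cos x d(sin x) over [0, 2*pi], whose value is pi.
integral = stieltjes_sum(math.cos, math.sin, points)

# Right-hand side of (56) with L = max |cos x| = 1; the sum is close to V(sin) = 4.
bound = 1.0 * variation_sum(math.sin, points)

print(integral, "<=", bound)
```

In this example the bound is far from sharp (roughly 3.14 against 4); equality in (56) requires |f(x)| to equal L wherever g(x) varies, with a sign matching that of the increments of g(x).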

This formula replaces (21) of [4]. Let us also recall the formula [2]:

$$\int_a^b f(x)\,d\Bigl[\sum_{k=1}^{p} a_k g_k(x)\Bigr] = \sum_{k=1}^{p} a_k \int_a^b f(x)\,dg_k(x). \tag{57}$$

If the $g_k(x)$ are of bounded variation, a linear combination of them,
$a_1 g_1(x) + \ldots + a_p g_p(x)$, is also a function of bounded variation.

We now consider the Stieltjes integral with variable upper limit
when f(x) is continuous and g(x) is of bounded variation:

$$F(x) = \int_a^x f(t)\,dg(t). \tag{58}$$


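We shall prove that F(x) is a function of bounded variation. We
form the sum $t_\delta$ for [a, x]:

$$t_\delta = \sum_{k=1}^{n} \Bigl|\int_{x_{k-1}}^{x_k} f(t)\,dg(t)\Bigr|.$$

On applying (56), we get

$$t_\delta \le L \sum_{k=1}^{n} V_{x_{k-1}}^{x_k}(g) = L\,V_a^x(g),$$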


whence our assertion follows. At the points where g(x) is continuous,
the function $V_a^x(g)$ is also continuous, and an immediate consequence
of the inequality

$$\Bigl|\int_x^{x+h} f(t)\,dg(t)\Bigr| \le L\,V_x^{x+h}(g)$$

is that F(x) is continuous at these points. Let us also prove the following
statement: if f(x) and $\varphi(x)$ are continuous in [a, b] and g(x) is of
bounded variation, the formula holds:

$$\int_a^b \varphi(x)\,d\Bigl[\int_a^x f(t)\,dg(t)\Bigr] = \int_a^b \varphi(x) f(x)\,dg(x). \tag{59}$$




It is sufficient to prove this formula for the case of increasing g(x).
We form the sum $\sigma_\delta$ for the integral on the left of (59):

$$\sigma_\delta = \sum_{k=1}^{n} \varphi(\xi_k) \int_{x_{k-1}}^{x_k} f(t)\,dg(t),$$

or, by the mean value theorem [4]:

$$\sigma_\delta = \sum_{k=1}^{n} \varphi(\xi_k) f(\xi_k')\,[g(x_k) - g(x_{k-1})]. \tag{60}$$

The points $\xi_k$ and $\xi_k'$ belong to the same interval $[x_{k-1}, x_k]$, and by arguing
precisely as we did for (9) of [2], it may be seen that the sum on the
right-hand side of (60) yields in the limit the integral on the right-hand
side of (59), and the last formula is thereby proved.


10. Existence of the Stieltjes integral. We have so far considered the Stieltjes 
integral of a continuous function f(x) with respect to a function of bounded 
variation g(x). It follows from the formula for integration by parts [2] that 
the function of bounded variation g(x) is integrable with respect to the continu- 
ous function f(x). We shall give below some simple conditions for the existence 
of the Stieltjes integral in other cases. We shall assume that f(x) and g(x) are 
bounded in the finite interval [a, b] and that g(x) is a non-decreasing function. 
Since we shall make use in future of the Stieltjes integral only in the case when
f(x) is continuous, the results quoted below are given without proof.

I. A necessary condition for the existence of the Stieltjes integral is that f(x)
be continuous at every point of discontinuity of g(x), and if this condition is
fulfilled, f(x) is integrable with respect to the jump function $g_d(x)$ and the integral
of f(x) with respect to $g_d(x)$ is given by formula (35).

If this necessary condition for integrability is satisfied, the question of the
integrability of f(x) with respect to g(x) reduces to the question of the
integrability of f(x) with respect to the continuous non-decreasing function $g_c(x)$.
If, for instance, f(x) is a function of bounded variation, we have integrability,
as already indicated. The necessary and sufficient condition for integrability
of f(x) with respect to $g_c(x)$ is as follows.

II. The necessary and sufficient condition for integrability of f(x) with respect
to $g_c(x)$ is that, given any positive $\varepsilon$, the points of discontinuity of f(x) can be
covered by a finite or denumerable set of intervals $[a_k, b_k]$ (which may overlap)
such that

$$\sum_k [g_c(b_k) - g_c(a_k)] < \varepsilon. \tag{61}$$


We now suppose that g(x) is a function of bounded variation. Suppose we
have the canonical form of this function as a difference (45) and some other like
form (47). We form the difference $S_\delta - s_\delta$ for $g_1(x)$ and $g_1^*(x)$:

$$S_\delta - s_\delta = \sum_{k=1}^{n} (M_k - m_k)[g_1(x_k) - g_1(x_{k-1})], \qquad S_\delta^* - s_\delta^* = \sum_{k=1}^{n} (M_k - m_k)[g_1^*(x_k) - g_1^*(x_{k-1})]. \tag{62}$$

We have seen above that

$$g_1(x_k) - g_1(x_{k-1}) \le g_1^*(x_k) - g_1^*(x_{k-1}),$$

so that, if $S_\delta^* - s_\delta^* \to 0$ on indefinite subdivision, all the more $S_\delta - s_\delta \to 0$,
i.e. if f(x) is integrable with respect to $g_1^*(x)$, it is also integrable with respect
to $g_1(x)$. Similarly, if f(x) is integrable with respect to $g_2^*(x)$, it is also integrable
with respect to $g_2(x)$. But integrability with respect to $g_k(x)$ (k = 1, 2) does
not imply integrability with respect to $g_k^*(x)$. Thus, to investigate the
integrability of f(x) with respect to g(x), we have to investigate the integrability
of f(x) with respect to $g_1(x)$ and $g_2(x)$. If f(x) is integrable with respect
to $g_1(x)$ and $g_2(x)$, the integral of f(x) with respect to g(x) exists, and is given
by (55).

III. The integrability of f(x) with respect to $g_1(x)$ and $g_2(x)$ is equivalent to the
integrability of f(x) with respect to the total variation $V_a^x(g) = g_1(x) + g_2(x)$.

The proof of this proposition is simple, as is that of proposition I. The proof 
of proposition II is much more difficult. 


11. Passage to the limit in the Stieltjes integral. This and the next few
sections will give some theorems on passage to the limit under the sign of the
Stieltjes integral. We have already had one of these theorems. It was concerned
with the case when the integrable functions tend uniformly to the limit function
f(x). Let $f_n(x)$ be continuous in [a, b], let $f_n(x) \to f(x)$ uniformly in [a, b],
and let g(x) be of bounded variation in [a, b]. We have on the basis of [4]
and (55):

$$\lim_{n \to \infty} \int_a^b f_n(x)\,dg(x) = \int_a^b f(x)\,dg(x). \tag{63}$$


We shall indicate some simple generalizations of this statement, confining 
ourselves to an infinite interval. 

THEOREM 1. Let $f_n(x)$ be continuous inside $[-\infty, +\infty]$ and bounded by the
same number: $|f_n(x)| \le L$, independently of n, let $f_n(x) \to f(x)$ uniformly in
any finite interval and g(x) be of bounded variation in $[-\infty, +\infty]$, and be continuous
at the ends of this interval. Formula (63) now holds for the interval $[-\infty, +\infty]$.

The function f(x) is continuous and bounded inside $[-\infty, +\infty]$, so that it
is integrable with respect to g(x). On recalling that g(x), and consequently its
total variation, is continuous at the ends of the interval, and that
$|f(x) - f_n(x)| \le 2L$, we can say by using (56) that, given any positive $\varepsilon$, there exists a positive
A such that, for any n:

$$\Bigl|\int_{-\infty}^{-A} [f(x) - f_n(x)]\,dg(x)\Bigr| < \varepsilon; \qquad \Bigl|\int_{+A}^{+\infty} [f(x) - f_n(x)]\,dg(x)\Bigr| < \varepsilon.$$

The passage to the limit $f_n(x) \to f(x)$ is uniform in [-A, +A], so that we
have, by the above-mentioned theorem:

$$\Bigl|\int_{-A}^{+A} [f(x) - f_n(x)]\,dg(x)\Bigr| < \varepsilon$$

for all sufficiently large n, i.e.

$$\Bigl|\int_{-\infty}^{+\infty} [f(x) - f_n(x)]\,dg(x)\Bigr| < 3\varepsilon,$$

whence the theorem follows, since $\varepsilon$ is arbitrary. We shall now prove a similar
theorem for the case when the $f_n(x)$ are unbounded in $[-\infty, +\infty]$, and the
integral over this interval has to be understood as an improper integral.

THEOREM 2. Let $f_n(x)$ be continuous inside $[-\infty, +\infty]$, let the improper
integrals

$$\int_{-\infty}^{+\infty} f_n(x)\,dg(x) = \lim_{\substack{\alpha \to -\infty \\ \beta \to +\infty}} \int_\alpha^\beta f_n(x)\,dg(x) \tag{64}$$

exist uniformly with respect to n, $f_n(x) \to f(x)$ uniformly in any finite interval
and g(x) be a function of bounded variation in any finite interval. Now, the
(improper) integral of f(x) with respect to g(x) over the interval $[-\infty, +\infty]$
exists, and (63) holds.

The function f(x) is continuous in any finite interval and integrable over
such an interval with respect to g(x). We show that it is integrable over an
infinite interval. Let $\varepsilon$ be a given positive number. Since integrals (64) are
uniformly convergent with respect to n, a positive A exists, such that, for
any interval [B', B''] lying outside [-A, +A], and any subscript n, we have

$$\Bigl|\int_{B'}^{B''} f_n(x)\,dg(x)\Bigr| < \varepsilon. \tag{65}$$

We fix B' and B'' in some way so that the interval [B', B''] lies outside
[-A, +A]. Now, in view of the uniform convergence $f_n(x) \to f(x)$ in the interval
[B', B''], we have for all sufficiently large n:

$$\Bigl|\int_{B'}^{B''} [f(x) - f_n(x)]\,dg(x)\Bigr| < \varepsilon. \tag{66}$$

On taking into account the obvious equation:

$$\int_{B'}^{B''} f(x)\,dg(x) = \int_{B'}^{B''} f_n(x)\,dg(x) + \int_{B'}^{B''} [f(x) - f_n(x)]\,dg(x),$$

we obtain by virtue of (65) and (66):

$$\Bigl|\int_{B'}^{B''} f(x)\,dg(x)\Bigr| < 2\varepsilon,$$

whence it follows that the integral of f(x) with respect to g(x) over the interval
$[-\infty, +\infty]$ exists. To prove (63), it is sufficient to notice that the integral
of the difference $f(x) - f_n(x)$ will have a sufficiently small absolute value in
sufficiently remote sub-intervals, whilst in the finite sub-intervals it will be
small for all sufficiently large n by virtue of the uniform convergence of $f_n(x)$
to f(x).

It is worth mentioning that there is no need for uniform convergence of
$f_n(x)$ to f(x) for the validity of (63) in the case of a finite interval. It is sufficient
to require that the continuous functions $f_n(x)$ tend to the continuous function
f(x), whilst remaining bounded, independently of n, i.e. there must exist a
positive L such that $|f_n(x)| \le L$ for any n and all x of [a, b]. We shall prove
this assertion later [50].
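The role of the common bound L in this remark can be seen from a simple example, which is not taken from the original text: the interval [0, 1], the integrating function g(x) = x and the sequence $f_n(x)$ below are chosen for illustration only.

```latex
\begin{aligned}
& f_n(x) = n^2 x\, e^{-nx}, \qquad f_n(x) \to 0 \ \text{at every point of } [0, 1], \qquad
  \max_{0 \le x \le 1} f_n(x) = f_n\!\left(\tfrac{1}{n}\right) = \frac{n}{e} \to \infty , \\
& \int_0^1 f_n(x)\,dx = \int_0^n u\, e^{-u}\,du = 1 - (n + 1)\,e^{-n} \ \longrightarrow\ 1 \ \ne\ 0 = \int_0^1 0\,dx .
\end{aligned}
```

The functions are continuous and tend to a continuous limit, but they are not bounded independently of n, and (63) fails; so the boundedness condition cannot simply be dropped.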


12. Helly’s theorem. We now consider a theorem on passage to the limit in the
case when the integrating function g(x) varies. We must investigate as a
preliminary the convergence of a function of bounded variation to a limit function.
Let $g_n(x)$ be a sequence of functions of bounded variation in the interval
[a, b], the variation of all these functions being bounded by the same number
L, independent of n:

$$V_a^b(g_n) \le L. \tag{67}$$

Suppose that, at every point of [a, b], the $g_n(x)$ tend to the limit function g(x),
which has finite values. It may easily be seen that g(x) is also a function of
bounded variation. In fact, we have the inequality for the sums $t_\delta^{(n)}$ for the
$g_n(x)$:

$$t_\delta^{(n)} = \sum_{k=1}^{m} |g_n(x_k) - g_n(x_{k-1})| \le L,$$

whence we obtain, on passing to the limit, the same inequality for the sum $t_\delta$
for g(x):

$$t_\delta = \sum_{k=1}^{m} |g(x_k) - g(x_{k-1})| \le L,$$

whence it follows that the variation of g(x) is also not greater than L. If the
$g_n(x)$, instead of tending to g(x) at every point of [a, b], only tend to g(x) in
a set $\mathcal{E}$ of points $x_k$ (k = 1, 2, 3, ...) dense in [a, b], it is no longer possible
to assert that g(x) is a function of bounded variation. We shall assume in future
that g(x) is in fact of bounded variation in this case. It may be mentioned that
a set $\mathcal{E}$ of points $x_k$ is said to be dense in [a, b] if any part of [a, b] contains
an infinite set of points of $\mathcal{E}$. Let $g_n(x)$ be a sequence of increasing functions,
which tends to a finite-valued limit function g(x) at every point of [a, b].
The limit function here will also be increasing, and hence will be a function of
bounded variation. Let us prove the following theorem:




THEOREM 1. If the $g_n(x)$ are increasing functions in the interval [a, b], and tend
to a function g(x) on a set of points $\mathcal{E}$ dense in [a, b], the convergence holds at every
interior point of [a, b] where g(x) is continuous.

Let g(x) be continuous at the point $x_0$, and let x', x'' be points of the set $\mathcal{E}$
to the left and right of $x_0$, i.e. $x' < x_0 < x''$. We have $g_n(x') \le g_n(x_0) \le g_n(x'')$,
so that

$$g(x_0) - g_n(x'') \le g(x_0) - g_n(x_0) \le g(x_0) - g_n(x').$$

We can rewrite this inequality as follows:

$$[g(x_0) - g(x'')] + [g(x'') - g_n(x'')] \le g(x_0) - g_n(x_0) \le [g(x_0) - g(x')] + [g(x') - g_n(x')]. \tag{68}$$

Let $\varepsilon$ be a given positive number. The points x' and x'' of $\mathcal{E}$, which is everywhere
dense in [a, b], can be taken so close to $x_0$ that $|g(x_0) - g(x'')| < \varepsilon$
and $|g(x_0) - g(x')| < \varepsilon$, since $x_0$ is a point where g(x) is continuous. Having
thus fixed x' and x'', we have the inequalities, for all sufficiently large n:
$|g(x') - g_n(x')| < \varepsilon$ and $|g(x'') - g_n(x'')| < \varepsilon$, since the $g_n(x)$ tend to g(x)
at x' and x''. These last inequalities, in conjunction with (68), at once give us

$$-2\varepsilon < g(x_0) - g_n(x_0) < +2\varepsilon,$$

whence, since $\varepsilon$ is arbitrary, it follows that $g_n(x_0) \to g(x_0)$. We shall now state
a fundamental theorem on passage to the limit.
THEOREM 2 (Helly). Let f(x) be continuous in [a, b], $g_n(x)$ be of bounded
variation, the variations $V_a^b(g_n)$ being not greater than some number L independent
of n, and $g_n(x) \to g(x)$ at all points of [a, b]. The formula now holds:

$$\lim_{n \to \infty} \int_a^b f(x)\,dg_n(x) = \int_a^b f(x)\,dg(x). \tag{69}$$

As indicated above, g(x) is a function of bounded variation, so that f(x) is
integrable with respect to g(x). We divide the interval into sub-intervals:
$a = x_0 < x_1 < \ldots < x_{m-1} < x_m = b$ and write the obvious formula:

$$\int_a^b f(x)\,dg(x) = \sum_{k=1}^{m} \int_{x_{k-1}}^{x_k} f(x)\,dg(x) = \sum_{k=1}^{m} \int_{x_{k-1}}^{x_k} [f(x) - f(x_k)]\,dg(x) + \sum_{k=1}^{m} f(x_k) \int_{x_{k-1}}^{x_k} dg(x),$$

i.e.

$$\int_a^b f(x)\,dg(x) = \sum_{k=1}^{m} \int_{x_{k-1}}^{x_k} [f(x) - f(x_k)]\,dg(x) + \sum_{k=1}^{m} f(x_k)[g(x_k) - g(x_{k-1})]. \tag{70}$$

Given any positive $\varepsilon$, we can fix such a fine subdivision of [a, b] that
$|f(x) - f(x_k)| < \varepsilon$ for any k and all x of $[x_{k-1}, x_k]$. This follows at once from the
uniform continuity of f(x) in [a, b]. We now obtain:

$$\Bigl|\int_{x_{k-1}}^{x_k} [f(x) - f(x_k)]\,dg(x)\Bigr| \le \varepsilon\,V_{x_{k-1}}^{x_k}(g),$$

whence

$$\Bigl|\sum_{k=1}^{m} \int_{x_{k-1}}^{x_k} [f(x) - f(x_k)]\,dg(x)\Bigr| \le \varepsilon \sum_{k=1}^{m} V_{x_{k-1}}^{x_k}(g),$$

i.e.

$$\Bigl|\sum_{k=1}^{m} \int_{x_{k-1}}^{x_k} [f(x) - f(x_k)]\,dg(x)\Bigr| \le \varepsilon\,V_a^b(g) \le \varepsilon L.$$

With the subdivision fixed above, we can write (70) in the form

$$\int_a^b f(x)\,dg(x) = \theta \varepsilon L + \sum_{k=1}^{m} f(x_k)[g(x_k) - g(x_{k-1})],$$

where $|\theta| \le 1$. Similar arguments lead us to the formula

$$\int_a^b f(x)\,dg_n(x) = \theta_n \varepsilon L + \sum_{k=1}^{m} f(x_k)[g_n(x_k) - g_n(x_{k-1})],$$

where $|\theta_n| \le 1$. Subtraction term by term gives us

$$\int_a^b f(x)\,dg(x) - \int_a^b f(x)\,dg_n(x) = (\theta - \theta_n)\,\varepsilon L + \sum_{k=1}^{m} f(x_k)\bigl\{[g(x_k) - g_n(x_k)] - [g(x_{k-1}) - g_n(x_{k-1})]\bigr\}.$$

The points $x_k$ are fixed, and, since the $g_n(x)$ converge to g(x) at these points,
the sum appearing in the last formula has an absolute value less than $\varepsilon$ for all
sufficiently large n. The last formula thus leads us to the inequality, for these
n:

$$\Bigl|\int_a^b f(x)\,dg(x) - \int_a^b f(x)\,dg_n(x)\Bigr| \le \varepsilon\,(2L + 1),$$

whence (69) follows, since $\varepsilon$ is arbitrary.
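A familiar special case may help to fix ideas. The example is not taken from the original text: the interval [0, 1] and the step functions $g_n(x)$ below are chosen for illustration, and the value of the integral with respect to a step function is taken as the sum of the values of f(x) at the jumps multiplied by the sizes of the jumps, as for integration with respect to a jump function.

```latex
\begin{aligned}
& g_n(x) = \frac{[nx]}{n} \quad (0 \le x \le 1), \qquad
  |g_n(x) - x| \le \frac{1}{n}, \qquad V_0^1(g_n) = 1 , \\
& \int_0^1 f(x)\,dg_n(x) = \frac{1}{n} \sum_{k=1}^{n} f\!\left(\frac{k}{n}\right)
  \ \longrightarrow\ \int_0^1 f(x)\,dx \qquad (n \to \infty) .
\end{aligned}
```

Here [nx] denotes the integral part of nx. The functions $g_n(x)$ tend to g(x) = x at every point and have variations bounded by L = 1, so that (69) in this case is the convergence of Riemann sums of a continuous function to its integral.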

Note. Suppose that the $g_n(x)$ tend to g(x) only on a set of points dense
in [a, b], instead of at every point, both ends of the interval being included in
the set; the limit function g(x) is assumed to be of bounded variation. Now,
if we take points of the set as points $x_k$ of subdivision of the interval, the above
proof remains in force, and we arrive at (69) as before. We shall now mention
some generalizations of the theorem, similar to the generalizations of [11].

THEOREM 3. Let f(x) be continuous inside the interval $[-\infty, +\infty]$ and bounded,
let $g_n(x)$ and g(x) be increasing functions in this interval and continuous at the
ends, and let $g_n(x) \to g(x)$ on a set of points dense in the interval, and in particular
at both ends. We now have the formula:

$$\lim_{n \to \infty} \int_{-\infty}^{+\infty} f(x)\,dg_n(x) = \int_{-\infty}^{+\infty} f(x)\,dg(x). \tag{71}$$




It is worth remarking that, in the present case, the total variation of $g_n(x)$
is given by the difference $g_n(+\infty) - g_n(-\infty)$, and it follows at once from the
conditions of the theorem that all these total variations are not greater than
some L independent of n. The integrals of f(x) with respect to the $g_n(x)$ and g(x)
exist. Let us find an inequality for the difference between these integrals, by
splitting the interval of integration into three parts: $[-\infty, a]$, [a, b], $[b, +\infty]$,
where the points a and b belong to the set in which $g_n(x) \to g(x)$:

$$\Bigl|\int_{-\infty}^{+\infty} f(x)\,dg(x) - \int_{-\infty}^{+\infty} f(x)\,dg_n(x)\Bigr| \le \Bigl|\int_{-\infty}^{a} f\,dg - \int_{-\infty}^{a} f\,dg_n\Bigr| + \Bigl|\int_a^b f\,dg - \int_a^b f\,dg_n\Bigr| + \Bigl|\int_b^{+\infty} f\,dg - \int_b^{+\infty} f\,dg_n\Bigr|. \tag{72}$$

The function f(x) is bounded, i.e. $|f(x)| \le L$. We can write for the first of
the differences:

$$\Bigl|\int_{-\infty}^{a} f(x)\,dg(x) - \int_{-\infty}^{a} f(x)\,dg_n(x)\Bigr| \le L\,\bigl\{[g(a) - g(-\infty)] + [g_n(a) - g_n(-\infty)]\bigr\},$$

which can be put in the form

$$\Bigl|\int_{-\infty}^{a} f\,dg - \int_{-\infty}^{a} f\,dg_n\Bigr| \le L\,\bigl\{2[g(a) - g(-\infty)] + [g_n(a) - g(a)] + [g(-\infty) - g_n(-\infty)]\bigr\}.$$

Since g(x) is continuous at $x = -\infty$, we can fix an a so close to $(-\infty)$ that
the positive difference $g(a) - g(-\infty)$ is less than any previously assigned positive
number. Having fixed this a, we further remark that the differences
$g_n(a) - g(a)$ and $g(-\infty) - g_n(-\infty)$ are also as small as desired in absolute
value for all sufficiently large n. A similar treatment can be given of the third
term on the right-hand side of (72). Hence, given any positive $\varepsilon$, we can fix
a and b of the above-mentioned set, everywhere dense in $[-\infty, +\infty]$, such
that the first and third terms on the right of (72) are less than $\varepsilon$ for all sufficiently
large n. The theorem proved above is applicable to the finite interval [a, b],
i.e. the second term on the right of (72) is also less than $\varepsilon$ for all sufficiently
large n. Thus the left-hand side of (72) is less than $3\varepsilon$ for all sufficiently large
n, whence (71) follows, in view of the arbitrariness of $\varepsilon$.

We now turn to a second generalization of Theorem 2, which is concerned 
with improper Stieltjes integrals. 

THEOREM 4. Let f(x) be continuous inside the interval $[-\infty, +\infty]$, let $g_n(x)$
and g(x) be of bounded variation in this interval, the variations of the $g_n(x)$ being
not greater than some number independent of n. Further, let $g_n(x) \to g(x)$ on a
set dense in $[-\infty, +\infty]$, and let the improper integrals

$$\int_{-\infty}^{+\infty} f(x)\,dg_n(x) = \lim_{\substack{a \to -\infty \\ b \to +\infty}} \int_a^b f(x)\,dg_n(x)$$

be uniformly convergent with respect to n. The function f(x) is now integrable with
respect to g(x) over $[-\infty, +\infty]$, and (71) holds.
Given any positive $\varepsilon$, there exists a positive A such that

$$\Bigl|\int_{B'}^{B''} f(x)\,dg_n(x)\Bigr| < \varepsilon \tag{73}$$

for any n and any interval [B', B''] lying outside [-A, +A].
We fix any such interval [B', B''] and write the obvious inequality

$$\Bigl|\int_{B'}^{B''} f(x)\,dg(x)\Bigr| \le \Bigl|\int_{B'}^{B''} f(x)\,dg_n(x)\Bigr| + \Bigl|\int_{B'}^{B''} f(x)\,dg(x) - \int_{B'}^{B''} f(x)\,dg_n(x)\Bigr|.$$

In view of the remark on Theorem 2, an n can be chosen so large that the
second term on the right-hand side is less than $\varepsilon$. It now follows at once from
(73) that

$$\Bigl|\int_{B'}^{B''} f(x)\,dg(x)\Bigr| < 2\varepsilon.$$

Since $\varepsilon$ is arbitrary, this inequality shows that the integral of f(x) with
respect to g(x) over $[-\infty, +\infty]$ exists. Formula (71) is proved by using precisely
the same method as above, of dividing the interval $[-\infty, +\infty]$ into three parts.


13. Selection principle. We have already investigated a selection principle 
for sets of continuous functions [IV; 15 and 16]. We now prove a theorem which 
gives us a selection principle for functions of bounded variation. 

THEOREM (Helly). Let $\mathcal{E}$ be a set of functions of bounded variation in the
interval [a, b] (finite or infinite), where a positive number L exists such that,
for all functions g(x) belonging to $\mathcal{E}$, we have the inequalities:

$$|g(x)| \le L; \qquad V_a^b(g) \le L, \tag{74}$$

i.e. all the g(x) are bounded in absolute value and their variations over [a, b] are
also bounded by some number. Now, from any infinite sequence $g_n(x)$ of functions
belonging to the set $\mathcal{E}$, we can select a subsequence $g_{n_k}(x)$ which tends to some
function of bounded variation g(x) at every point of [a, b].

We only need to prove the possibility of selecting a subsequence $g_{n_k}(x)$ which
tends to a limit at every point of [a, b]. After this, it will follow at once from the
conditions of the theorem, in view of what was said in [12], that the limit function
g(x) is of bounded variation. Let us prove a preliminary lemma.

LEMMA. If there exists a sequence of functions $h_n(x)$, increasing in [a, b]
and bounded by the same number L, we can extract from it a subsequence of functions
having a limit at every point of [a, b].

We form the denumerable set of points $x_k$ (k = 1, 2, ...), contained in [a, b]
and consisting of the left-hand end x = a and of all the points x that have rational
abscissae. This set of points is dense in [a, b], and we can extract a subsequence
$h_{n_k}(x)$ which is convergent at all the points $x_k$. We thus obtain a limit function
h(x), as yet defined only at the points $x_k$. We extend it as follows to the
remaining points of [a, b]. If x is a point of [a, b] not belonging to the
above-mentioned set of points $x_k$, we shall take h(x) to be equal to the strict upper
bound of the values of $h(x_k)$ for all the $x_k$ lying to the left of x, i.e. we put

$$h(x) = \sup_{x_k < x} h(x_k).$$

The h(x) thus constructed will obviously be increasing and bounded in [a, b].
It can only have a finite or denumerable set of points of discontinuity: $\xi_1, \xi_2, \ldots$
The sequence $h_{n_k}(x)$ is convergent to h(x) at all the points $x_k$ of the set everywhere
dense in [a, b]. By Theorem 1 of [12], we also have convergence at all
the points inside [a, b] where h(x) is continuous. The convergence of $h_{n_k}(x)$
to h(x) can therefore only break down at the points of discontinuity $\xi_k$ of
h(x) and at the right-hand end of the interval. On again applying the selection
principle to the sequence $h_{n_k}(x)$, we can arrange to have convergence at these
points also [IV; 15], and the lemma is therefore proved.

The fundamental theorem is now fairly easy to prove. Every function of
bounded variation g(x) belonging to the set $\mathcal{E}$ can be written as a difference
between increasing functions:

$$g(x) = \tfrac{1}{2}\,[V_a^x(g) + g(x)] - \tfrac{1}{2}\,[V_a^x(g) - g(x)], \tag{75}$$

where, by (74), both these increasing functions have absolute values not exceeding
L. We can say, on applying the lemma, that it is possible to extract from
the sequence $g_n(x)$ a sequence for which the minuend on the right-hand side of
(75) tends to a limit function at all points of [a, b]. On again applying the lemma,
we can say that it is possible to extract from the sequence obtained a subsequence
for which the subtrahend on the right-hand side of (75) also tends to a limit
function at every point of [a, b]. Thus we obtain a subsequence $g_{n_k}(x)$ which
tends to a limit function at every point of [a, b], and the theorem is proved.


14. Space of continuous functions. We consider the set of all functions
taking real values and continuous in a given finite interval [a, b], and describe
this set as a space C. An element (or vector) of this space is any function
continuous in [a, b]. Different continuous functions represent different elements of
C. The function which is identically zero in [a, b] is called the zero element
of C. If we form any finite linear combination of real functions continuous in
[a, b], with real coefficients: $c_1 f_1(x) + c_2 f_2(x) + \ldots + c_m f_m(x)$, we obtain a
real function continuous in [a, b], i.e. elements of C can be multiplied by real
numbers and added to give a further element of C. This operation is subject
to the ordinary laws of elementary algebra, e.g.

$$f_1(x) + f_2(x) = f_2(x) + f_1(x); \qquad c\,[f_1(x) + f_2(x)] = c f_1(x) + c f_2(x);$$
$$(c_1 + c_2) f(x) = c_1 f(x) + c_2 f(x); \qquad c_1 (c_2 f(x)) = (c_1 c_2) f(x).$$

The concept of the norm of an element may be introduced into space C;
in other words, we introduce the idea of the length of a vector in C. The norm
of an element f(x) is defined as the maximum value taken by |f(x)| in [a, b].
The norm of the zero element is zero, and is positive for any other element.


We shall use the notation $\|f\|$ for the norm of the element f(x). Finally, the
concept of convergence can be introduced into space C. We shall say that a
sequence of elements $f_n(x)$ of C is convergent to the element f(x) of C if
$\|f(x) - f_n(x)\| \to 0$. This last is equivalent to the fact that the maximum value of
$|f(x) - f_n(x)|$ tends to zero in the interval [a, b], which is in turn obviously
equivalent to the fact that $f_n(x) \to f(x)$ uniformly in [a, b].
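Convergence in the norm of C can be illustrated numerically. The sketch below is an illustration only: the norm is approximated by sampling a uniform grid, which merely approximates the maximum, and the interval [0, 1], the sequence $f_n(x) = \sin x + x/n$ and the names `sup_norm` and `f_n` are hypothetical choices.

```python
import math

def sup_norm(func, a, b, samples=10001):
    """Approximate max |func(x)| on [a, b] by sampling a uniform grid.

    This is only a grid approximation of the norm in C; the grid size is
    an illustrative assumption.
    """
    return max(abs(func(a + (b - a) * k / (samples - 1))) for k in range(samples))

f = math.sin

def f_n(n):
    """f_n(x) = sin x + x / n, which tends to sin x uniformly on [0, 1]."""
    return lambda x: math.sin(x) + x / n

indices = (1, 10, 100, 1000)
# Norm of f - f_n for each n; here |f(x) - f_n(x)| = x / n, largest at x = 1.
distances = [sup_norm(lambda x, fn=f_n(n): f(x) - fn(x), 0.0, 1.0) for n in indices]

print(distances)
```

The computed distances are close to 1/n and tend to zero, which is the statement that $f_n(x) \to f(x)$ in the sense of convergence in C.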

We next introduce the concepts of functional and operator in space C. A
functional in C is any definite law, in accordance with which any element f(x) of C
is associated with a definite real number. The following notations are generally
used for functionals: Φ[f(x)], Ψ[f(x)], etc. The concept of functional is a
modification of the ordinary concept of function. In the case of a functional the role
of argument is played by elements of C, whilst the value of the functional is
a real number. A functional is said to be distributive if, given any finite linear
combination of elements: c_1 f_1(x) + c_2 f_2(x) + ... + c_m f_m(x), it satisfies the
equation


$$\Phi[c_1 f_1(x) + c_2 f_2(x) + \dots + c_m f_m(x)] = c_1 \Phi[f_1(x)] + c_2 \Phi[f_2(x)] + \dots + c_m \Phi[f_m(x)]. \tag{76}$$


The functional Φ[f(x)] is said to be bounded if there exists a positive number
N such that

$$|\Phi[f(x)]| \le N\,\|f(x)\| \tag{77}$$


for any element f(x) of C. 

The left-hand side of this inequality is the absolute value of a real number
Φ[f(x)], which expresses the value of the functional in C for the element f(x),
whilst the right-hand side is the product of the positive number N with the norm
of element f(x), i.e. the product of N with the greatest value taken by | f(x) |
in the interval [a, b]. Functionals that are distributive and bounded are described
as linear. The concept of continuity of a functional may also be introduced:
the functional Φ[f(x)] is said to be continuous if the following condition is
satisfied: if f_n(x) → f(x) uniformly in [a, b], then Φ[f_n(x)] → Φ[f(x)]. A linear
functional may easily be seen to be continuous. For we can write, using (76)
and (77):


$$|\Phi[f(x)] - \Phi[f_n(x)]| = |\Phi[f(x) - f_n(x)]| \le N\,\|f(x) - f_n(x)\|.$$


In view of the uniform convergence of f_n(x) to f(x), we have || f(x) − f_n(x) || → 0,
so that Φ[f(x)] − Φ[f_n(x)] → 0, i.e. in fact Φ[f_n(x)] → Φ[f(x)].
In the definition of linearity, we could have required that the functional be
distributive and continuous, then proved that it is bounded, i.e. boundedness and
continuity are equivalent in the case of a distributive functional. We shall not
dwell on the proof of this, since it presents no difficulty.

An example of a functional may be mentioned. Let x_0 be any fixed point
of the interval [a, b]. The values f(x_0) of continuous functions at this point
form a linear functional in C. The definite integral

$$\int_a^b f(x)\,dx$$

is also an example of a linear functional. Let g(x) be a function of bounded
variation in [a, b]. Given any element f(x) of C, we can form the Stieltjes integral

$$\Phi[f(x)] = \int_a^b f(x)\,dg(x). \tag{78}$$


It represents a linear functional Φ[f(x)]. It is distributive because the integral
is distributive with respect to the functions f(x), and it is bounded by virtue of the
inequality

$$\left|\int_a^b f(x)\,dg(x)\right| \le L\,V_a^b(g),$$

where L is the maximum of | f(x) | in [a, b]. Thus the role of the number N
appearing in (77) can be played for the functional (78) by the total variation
V_a^b(g).
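
The bound by the total variation can be illustrated with a Riemann-Stieltjes sum. The sketch below is an illustration under stated assumptions rather than anything taken from the text: the names `stieltjes_sum` and `variation_on_grid` are hypothetical, a uniform grid with midpoints as the intermediate points is assumed, and the sample choice g(x) = x² on [−1, 1] (total variation 2) with f(x) = x (so L = 1) is arbitrary.

```python
def stieltjes_sum(f, g, a, b, n=2000):
    """Riemann-Stieltjes sum  sum f(xi_k) [g(x_k) - g(x_{k-1})]  on a uniform grid.

    The intermediate point xi_k is taken as the midpoint of each sub-interval.
    """
    total = 0.0
    for k in range(1, n + 1):
        x_prev = a + (b - a) * (k - 1) / n
        x_curr = a + (b - a) * k / n
        total += f(0.5 * (x_prev + x_curr)) * (g(x_curr) - g(x_prev))
    return total


def variation_on_grid(g, a, b, n=2000):
    """Sum of |g(x_k) - g(x_{k-1})| on the grid: a lower estimate of the total variation."""
    xs = [a + (b - a) * k / n for k in range(n + 1)]
    return sum(abs(xs_g1 - xs_g0) for xs_g0, xs_g1 in zip(map(g, xs), map(g, xs[1:])))


def f(x):
    return x          # sample element of C, with max |f| = 1 on [-1, 1]


def g(x):
    return x * x      # sample function of bounded variation on [-1, 1]


a, b = -1.0, 1.0
L = 1.0                                  # maximum of |f(x)| on [a, b]
phi = stieltjes_sum(f, g, a, b)          # approximates the integral of x d(x^2) = 4/3
V = variation_on_grid(g, a, b)           # approximates the total variation, 2
print(phi, L * V)                        # the first number does not exceed the second
```

For this choice the functional takes a value close to 4/3, comfortably below the bound L · V = 2.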

Let us return to (77). If this holds for some choice of the positive number N,
it holds all the more for larger values of N. Let us show that there is a least
value of N for which (77) holds.

We remark first of all that, if || f || = 0, f(x) is identically zero in [a, b],
i.e. f(x) is the zero element of C if || f || = 0. For any other element, || f || > 0.
The zero element of C can be written as the product f_0(x) = 0 · f(x), where
f(x) is any continuous function. It follows from (76) that Φ(f_0) = 0 · Φ(f) = 0,
i.e. any distributive functional vanishes on the zero element of C. For the zero
element, therefore, inequality (77) has the form 0 ≤ N · 0, and it holds for
any choice of N, i.e. we need discuss (77) only when || f || > 0. On taking (76)
into account, we can rewrite (77) as


$$\left|\Phi\!\left(\frac{f}{\|f\|}\right)\right| \le N. \tag{79}$$


But φ(x) = f(x)/|| f || is an element of C with unit norm.
Let n_Φ denote the strict upper bound of the non-negative numbers | Φ(φ) |,
where φ(x) is any element of C with unit norm:

$$n_\Phi = \sup_{\|\varphi\| = 1} |\Phi(\varphi)|. \tag{80}$$

It follows at once from (79) that n_Φ is in fact the least possible value of N
in (77). This number n_Φ is called the norm of the functional Φ(f). It is often also
written as || Φ ||.

We have

$$|\Phi(f)| \le n_\Phi\,\|f\|, \tag{81}$$

but (77) can no longer hold for all f(x) of C if we take N < n_Φ. We remark further
that n_Φ ≥ 0, and if n_Φ = 0, it follows from (81) that Φ(f) = 0 for any element
f(x) of C, i.e. the functional Φ(f) associates the number zero with any element
f(x). It follows from what has been said that the norm of a functional given
by (78) does not exceed V_a^b(g). It is worth recalling that we considered in volume
IV the space F of continuous functions, with a different definition of the norm
of an element [IV; 35].




15. General form of the functional in space C. We shall next prove Riesz's
important theorem, that any linear functional Φ(f) in C can be written in the
form (78), where g(x) is a function of bounded variation defining this functional.
But, as we have seen above, every integral (78), where g(x) is a fixed function
of bounded variation, is a linear functional in C. We can therefore
assert, after proving Riesz's theorem, that integral (78), where g(x) is a function
of bounded variation, represents the general form of linear functional in C.

THEOREM (F. Riesz). Every linear functional in space C can be written in
the form (78), where g(x) is a function of bounded variation.

We shall use polynomials of a special type in the proof of this theorem;
these polynomials were first introduced by Bernshtein, and have already been
mentioned. Let us recall the construction and basic property of the polynomials.
Let f(x) be a continuous function in the interval [0, 1]. The Bernshtein
polynomial corresponding to this function has the form


$$P_n(x) = \sum_{m=0}^{n} f\!\left(\frac{m}{n}\right) C_n^m x^m (1 - x)^{n-m} \qquad \left(C_n^m = \frac{n(n-1)\dots(n-m+1)}{m!}\right). \tag{82}$$


As we proved earlier [II, 154], on indefinite increase of n the sequence of
polynomials P_n(x) tends uniformly in the interval [0, 1] to the function f(x).
To prove our theorem, we transform the interval [a, b] into the interval [0, 1]
with the aid of the linear change of the independent variable: y = (x − a)/(b − a).
The space of functions continuous in [a, b] now becomes the space
of functions continuous in [0, 1], and we shall assume in the proof that the basic
interval [a, b] is already [0, 1]. Let Φ[f(x)] be a linear functional in space C. We have
to show that it can be written in the form (78), where g(x) is a function of bounded
variation in [0, 1]. We have the obvious equation

$$\sum_{m=0}^{n} C_n^m x^m (1 - x)^{n-m} = 1,$$


and all the terms of the sum written are non-negative if x belongs to [0, 1].
Hence it follows that, if ε_m is a number equal to +1 or −1, the inequality
holds:

$$\left|\sum_{m=0}^{n} \varepsilon_m C_n^m x^m (1 - x)^{n-m}\right| \le 1 \qquad (0 \le x \le 1). \tag{83}$$
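
The Bernshtein polynomials (82) and the identity that their basis terms sum to unity are easy to experiment with. The following is a small sketch under the assumption that sampling 201 points is enough to estimate the maximum error; the names `bernstein_basis`, `bernstein_poly` and `max_error` are hypothetical, and the test function |x − 1/2| is an arbitrary choice of a continuous, non-smooth f(x).

```python
from math import comb


def bernstein_basis(n, m, x):
    """The term C_n^m x^m (1 - x)^(n - m) appearing in (82)."""
    return comb(n, m) * x**m * (1 - x) ** (n - m)


def bernstein_poly(f, n, x):
    """Value at x of the Bernshtein polynomial P_n of the function f, formula (82)."""
    return sum(f(m / n) * bernstein_basis(n, m, x) for m in range(n + 1))


def max_error(f, n, samples=201):
    """Approximate max |f(x) - P_n(x)| on [0, 1] by sampling a uniform grid."""
    xs = [i / (samples - 1) for i in range(samples)]
    return max(abs(f(x) - bernstein_poly(f, n, x)) for x in xs)


def f(x):
    return abs(x - 0.5)   # continuous on [0, 1], not differentiable at 1/2


# The basis terms are non-negative and sum to 1 at every x of [0, 1].
unity = sum(bernstein_basis(12, m, 0.3) for m in range(13))

# The maximum error decreases as n grows, in line with uniform convergence.
errors = [max_error(f, n) for n in (10, 40, 160)]
print(unity, errors)
```

The convergence is slow for this f(x), which is typical of Bernshtein polynomials; the proof only needs that it is uniform.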


On applying the functional Φ[f(x)] to the polynomial on the left-hand side
of (83), we obtain, by (76), (83) and (77) with N = n_Φ:

$$\left|\sum_{m=0}^{n} \varepsilon_m \Phi[C_n^m x^m (1 - x)^{n-m}]\right| \le n_\Phi. \tag{84}$$


We now choose the signs of the ε_m so that the products
ε_m Φ[C_n^m x^m (1 − x)^{n−m}] are non-negative for any m. With this choice of ε_m,
(84) can be written as

$$\sum_{m=0}^{n} \left|\Phi[C_n^m x^m (1 - x)^{n-m}]\right| \le n_\Phi. \tag{85}$$




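We subdivide [0, 1] into n equal parts and define a function g_n(x) such that
it remains constant in each sub-interval, i.e. we define the function as follows:

$$\left.\begin{aligned}
g_n(0) &= 0, \\
g_n(x) &= \Phi[C_n^0 x^0 (1 - x)^{n}] && \text{for } 0 < x < \tfrac{1}{n}, \\
g_n(x) &= \Phi[C_n^0 x^0 (1 - x)^{n}] + \Phi[C_n^1 x^1 (1 - x)^{n-1}] && \text{for } \tfrac{1}{n} \le x < \tfrac{2}{n}, \\
g_n(x) &= \sum_{m=0}^{2} \Phi[C_n^m x^m (1 - x)^{n-m}] && \text{for } \tfrac{2}{n} \le x < \tfrac{3}{n}, \\
&\;\;\vdots \\
g_n(x) &= \sum_{m=0}^{n-1} \Phi[C_n^m x^m (1 - x)^{n-m}] && \text{for } \tfrac{n-1}{n} \le x < 1, \\
g_n(1) &= \sum_{m=0}^{n} \Phi[C_n^m x^m (1 - x)^{n-m}].
\end{aligned}\right\} \tag{86}$$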


The total variation of g_n(x) is evidently equal to the sum of the absolute
values of the jumps of g_n(x) at the points of subdivision and at the ends of
the interval. By (85), we have V_0^1(g_n) ≤ n_Φ. Similarly, it follows at once from
(85) and the definition of the function g_n(x) that | g_n(x) | ≤ n_Φ. Theorem 1
of [13] can therefore be applied to the sequence of g_n(x), and we can say that
a sequence of increasing positive integers n_k exists such that g_{n_k}(x) tends to
a function g(x) of bounded variation at every point of [0, 1]. We next show
that this function g(x) in fact appears in the right-hand side of (78). We form
the Stieltjes integral of f(x) with respect to g_n(x). It is equal to the sum of the
products of the values of f(x) at the points of discontinuity of g_n(x) with the
jumps at these points, i.e. [6]:

$$\int_0^1 f(x)\,dg_n(x) = \sum_{m=0}^{n} f\!\left(\frac{m}{n}\right) \Phi[C_n^m x^m (1 - x)^{n-m}].$$


By (76) and (82), the right-hand side is the value of the functional Φ
on the polynomial P_n(x), i.e.

$$\int_0^1 f(x)\,dg_n(x) = \Phi[P_n(x)].$$

We apply this formula for n = n_k:

$$\int_0^1 f(x)\,dg_{n_k}(x) = \Phi[P_{n_k}(x)]. \tag{87}$$


When n_k increases indefinitely, P_{n_k}(x) → f(x) uniformly in [0, 1] and, since
the functional Φ[f(x)] is continuous, (87) gives

$$\Phi[f(x)] = \lim_{n_k \to \infty} \int_0^1 f(x)\,dg_{n_k}(x).$$




We can apply Theorem 1 of [12] to the right-hand side, and we thus arrive
at the formula

$$\Phi[f(x)] = \int_0^1 f(x)\,dg(x). \tag{88}$$
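
The construction used in the proof can be followed on a concrete functional. The sketch below assumes, purely for illustration, the sample functional Φ(f) = ∫₀¹ f(x) dx; for it the Beta integral gives Φ[C_n^m x^m (1 − x)^{n−m}] = 1/(n + 1) exactly, so every jump of g_n(x) equals 1/(n + 1). The function names are hypothetical, and exact rational arithmetic is used so that no rounding enters.

```python
from fractions import Fraction
from math import comb, factorial


def phi_of_basis(n, m):
    """Phi[C_n^m x^m (1-x)^(n-m)] for the sample functional Phi(f) = integral of f over [0, 1].

    The Beta integral of x^m (1-x)^(n-m) is m!(n-m)!/(n+1)!, so the result is 1/(n+1).
    """
    return Fraction(comb(n, m) * factorial(m) * factorial(n - m), factorial(n + 1))


def jumps(n):
    """Jumps of the step function g_n at the points m/n (m = 0, 1, ..., n)."""
    return [phi_of_basis(n, m) for m in range(n + 1)]


def g_n(n, x):
    """Step function of the form (86): g_n(0) = 0, otherwise the sum of jumps at points m/n <= x."""
    if x == 0:
        return Fraction(0)
    return sum((j for m, j in enumerate(jumps(n)) if Fraction(m, n) <= x), Fraction(0))


def stieltjes_integral_gn(f, n):
    """Integral of f with respect to g_n: the sum of f(m/n) times the jump at m/n."""
    return sum(f(Fraction(m, n)) * j for m, j in enumerate(jumps(n)))


def total_variation_gn(n):
    """Total variation of g_n: the sum of the absolute values of its jumps."""
    return sum(abs(j) for j in jumps(n))


def square(x):
    return x * x


# For f(x) = x^2 the integral with respect to g_n is (2n + 1)/(6n), which tends to 1/3 = Phi(f).
for n in (10, 100, 1000):
    print(n, stieltjes_integral_gn(square, n), total_variation_gn(n))
```

In this example g_n(x) tends to g(x) = x, the total variation of every g_n equals 1, and 1 is also the norm of the sample functional, in agreement with inequality (85).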


We show that n_Φ = V_0^1(g). It follows at once from (88) that, as we have seen
in [14], n_Φ ≤ V_0^1(g). On the other hand, it follows from the inequality
V_0^1(g_{n_k}) ≤ n_Φ given above that V_0^1(g) ≤ n_Φ also. These two inequalities in fact
lead to the equation n_Φ = V_0^1(g). Let us consider whether form (88) for the functional
Φ(f) is unique. Let h(x) be a function of bounded variation, the values of which
differ from the values of g(x) on some set E of points of [0, 1], where E is a
finite or denumerable set. It may readily be seen that

$$\int_0^1 f(x)\,dh(x) \tag{89}$$


is equal to integral (88) for any choice of continuous function f(x). For, the set
of points of any interval contained in [0, 1] has the power of a continuum, so
that points not belonging to E form a set dense in [0, 1]. Therefore, when
forming the Riemann-Stieltjes sum for integral (88):

$$\sum_{k=1}^{n} f(\xi_k)\,[g(x_k) - g(x_{k-1})]$$

we can choose points not belonging to E as the points of subdivision when the
sub-intervals become indefinitely smaller, i.e. integrals (88) and (89) must be
the same.

Thus, on varying the function g(x) on a finite or denumerable set of points,
but in such a way that the new function h(x) is of bounded variation, we obtain
integral (89), yielding the same functional in C as integral (88). By what was
said in [14], we can assert that n_Φ ≤ V_0^1(h), where the < sign can in fact hold.
We had n_Φ = V_0^1(g) for the g(x) constructed by the method used in proving the
theorem.

We now pose the following general problem: for what functions of bounded 
variation h(x) does integral (89) define the same linear functional in C as integral 
(88)? 

On introducing the new function of bounded variation ω(x) = h(x) − g(x),
we arrive at the following problem: for what functions of bounded variation
ω(x) have we

$$\int_0^1 f(x)\,d\omega(x) = 0 \tag{90}$$

for any function f(x) continuous in [0, 1]? The next theorem supplies the answer.

THEOREM. The necessary and sufficient condition for (90) to hold with any
choice of continuous function f(x) is that the function of bounded variation ω(x)
satisfy the following: (1) ω(x_0) = ω(0) at every interior point x = x_0 of [0, 1]
at which ω(x) is continuous; (2) ω(1) = ω(0).




Necessity. We choose so large an n that x_0 > 1/n and the point x_0 + 1/n
lies inside [0, 1], and we define a continuous f(x) as follows:

$$f(x) = \begin{cases} 1 & \text{for } 0 \le x \le x_0, \\ -nx + (1 + n x_0) & \text{for } x_0 < x < x_0 + \frac{1}{n}, \\ 0 & \text{for } x_0 + \frac{1}{n} \le x \le 1. \end{cases}$$

In the central interval [x_0, x_0 + 1/n], f(x) is a linear function, decreasing from
unity to zero. With such an f(x), (90) gives


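$$\int_0^{x_0} d\omega(x) + \int_{x_0}^{x_0 + 1/n} [-nx + (1 + n x_0)]\,d\omega(x) = 0$$

or

$$\omega(x_0) - \omega(0) + \int_{x_0}^{x_0 + 1/n} [-nx + (1 + n x_0)]\,d\omega(x) = 0. \tag{91}$$

But we have [9]:

$$\left|\int_{x_0}^{x_0 + 1/n} [-nx + (1 + n x_0)]\,d\omega(x)\right| \le 1 \cdot V_{x_0}^{x_0 + 1/n}(\omega).$$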


Since ω(x) is continuous at x = x_0 by hypothesis, we can say that
V_{x_0}^{x_0 + 1/n}(ω) → 0 on indefinite increase of n, and (91) gives in the limit
ω(x_0) = ω(0). The second condition ω(1) = ω(0) is obtained from (90) if we put f(x) ≡ 1.

Sufficiency. Since the points at which a function of bounded variation is
continuous are distributed densely in [0, 1], we can make use of these points
only in forming the Riemann-Stieltjes sum. But now, in view of the conditions
ω(x_k) = ω(0) = ω(1), all the differences ω(x_k) − ω(x_{k−1}) vanish, so that (90)
holds for any choice of continuous f(x). The theorem is proved.

We have thus shown that the necessary and sufficient condition for integral 
(89) to give the same linear functional in C as (88) is that the difference 
h(x) — g(x) be equal to h(0) — g(0) at every point at which it is continuous, as 
also at x = 1. 

Let us return to (88). As we know, n_Φ ≤ V_0^1(h) for any function h(x) of bounded
variation that represents Φ(f) as a Stieltjes integral, whereas n_Φ = V_0^1(g) for the
g(x) in (88). Further, a constant can evidently be added to
g(x). We shall eliminate this many-valuedness by assuming g(0) = 0 in (88).
The function g(x) can only have a finite or denumerable set of points of
discontinuity. Let us consider a point of discontinuity ξ such that g(ξ − 0) = g(ξ + 0),
but g(ξ) ≠ g(ξ − 0) (a removable discontinuity). If we change
g(x) at the single point x = ξ by putting g(ξ) = g(ξ − 0), we remove this
discontinuity without changing integral (88), though with a decrease in V_0^1(g).
But this is an impossibility, since we had earlier V_0^1(g) = n_Φ, whilst after the
change described we should have the impossible V_0^1(g) < n_Φ. Hence it follows
from V_0^1(g) = n_Φ that g(x) has no removable discontinuities. There remain the
discontinuities at which g(ξ − 0) ≠ g(ξ + 0). If we take g(ξ), first as belonging
to the closed interval i with ends g(ξ − 0), g(ξ + 0), then as lying outside this
interval, V_0^1(g) will obviously be greater in the second case than in the first.
Hence it follows from V_0^1(g) = n_Φ that, at irremovable discontinuities, g(ξ)
belongs to the closed interval i. The position of the number g(ξ) in the interval
i has no effect on the total variation V_0^1(g). Thus, given V_0^1(g) = n_Φ, the
definition of g(ξ) at the points of discontinuity is the only arbitrary feature, though
g(ξ) must belong to i. We usually put

$$g(\xi) = \frac{g(\xi - 0) + g(\xi + 0)}{2},$$

or else either g(ξ) = g(ξ + 0) (continuity from the right) or g(ξ) = g(ξ − 0)
(continuity from the left).

A similar treatment to the above can be given of the space C of complex
functions φ(x) + iψ(x), continuous in the interval [0, 1]. The elements of this
space can be multiplied by complex numbers and added. The definitions of
norm and linear functional are retained, but the values of a functional may be
real or complex. The theorem on the general form of the linear functional holds,
where the complex functions of bounded variation have the form λ(x) + iμ(x),
λ(x) and μ(x) being real functions of bounded variation.


16. Linear operators in C. An operator in C is any definite law in accordance
with which any element f(x) of C is associated with a definite element
φ(x), also of C.

We bring in the notation F[f(x)] for the operator. Given a function f(x)
of C, the symbol F[f(x)] defines some function φ(x) also of C. The definition of
a distributive operator is like the definition of distributive functional, i.e.
a formula analogous to (76) is used. A bounded operator is defined by a formula
similar to (77), except that the absolute value on the left-hand side must be
replaced by the norm, since F[f(x)] is an element of C and not a number:

$$\|F[f(x)]\| \le N\,\|f(x)\|. \tag{77$_1$}$$

A distributive and bounded operator is said to be linear. Such an operator
is necessarily continuous, i.e. if f_n(x) → f(x), then F[f_n(x)] → F[f(x)], the
convergence in both cases being the uniform convergence of the corresponding
sequence of functions in [a, b].

We shall state without proof the fundamental result on the general form of
a linear operator in C. Let g(x, y) be a function defined in the closed
two-dimensional interval 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, and of bounded variation with
respect to x in the interval [0, 1] for any value of y in the interval [0, 1]. On
substituting this function g(x, y) in the right-hand side of (78), instead of a
number we obtain a function of the parameter y, defined in the interval [0, 1]:

$$\varphi(y) = \int_0^1 f(x)\,d_x g(x, y).$$

The necessary and sufficient condition for this last formula to yield a linear
operator is that g(x, y) has, in addition to the above properties, the further
property that the function φ(y) defined by the last formula be continuous in
[0, 1] for any choice of f(x) continuous in [0, 1]. If y_0 is a value from [0, 1],
and y_n (n = 1, 2, 3, ...) is a sequence of numbers from [0, 1] having y_0 as
its limit, the definition of continuity for φ(y) at y_0 leads at once to the following
necessary condition that must be satisfied by g(x, y): given any y_0 of [0, 1]
and any f(x) continuous in this interval, we must have

$$\lim_{y_n \to y_0} \int_0^1 f(x)\,d_x g(x, y_n) = \int_0^1 f(x)\,d_x g(x, y_0), \tag{92}$$

where y_n is any sequence of numbers of [0, 1] having y_0 as its limit. A function
g(x, y) that has these properties is usually said to be weakly continuous with
respect to the parameter y. If g(x, y) is of bounded variation with respect to
x and weakly continuous with respect to y, the above formula for φ(y) obviously
yields a linear operator in C. The converse can be proved, i.e. every linear operator
in C is expressible by this formula, where g(x, y) is of bounded variation with
respect to x and weakly continuous with respect to y. The proof of this theorem
and a discussion of the concept of weak continuity may be found in
V. I. Glivenko's book The Stieltjes Integral.

If K(x, y) is a function continuous in the two-dimensional interval
0 ≤ x ≤ 1, 0 ≤ y ≤ 1, the formula

$$\varphi(y) = \int_0^1 K(y, x)\,f(x)\,dx$$

evidently gives a linear operator in C. We encountered these operators in the
theory of integral equations. However, not every linear operator in C can be written
in this form.
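
An integral operator with a continuous kernel can be approximated by a quadrature rule, which makes its distributivity and boundedness visible numerically. The sketch below is only an illustration: the names `integral_operator` and `sup_norm`, the midpoint rule with 400 nodes, and the sample kernel K(y, x) = exp(−|y − x|) are all assumptions made for the example. Since this kernel is positive, the constant N in the bound can be taken as the maximum over y of the integral of K(y, x) with respect to x, i.e. the norm of the image of the function identically equal to 1.

```python
import math


def integral_operator(K, n=400):
    """Return F with F(f)(y) approximating the integral of K(y, x) f(x) over 0 <= x <= 1.

    The midpoint rule with n nodes is used, so F is a discretized stand-in for the operator.
    """
    nodes = [(j + 0.5) / n for j in range(n)]

    def F(f):
        values = [f(x) for x in nodes]
        return lambda y: sum(K(y, x) * v for x, v in zip(nodes, values)) / n

    return F


def sup_norm(phi, samples=201):
    """Approximate the norm of phi in C by sampling [0, 1] on a uniform grid."""
    return max(abs(phi(i / (samples - 1))) for i in range(samples))


def K(y, x):
    return math.exp(-abs(y - x))       # sample continuous, positive kernel


def f1(x):
    return math.sin(3 * x)             # ||f1|| = 1 on [0, 1]


def f2(x):
    return x * x


F = integral_operator(K)

# Distributivity: F[2 f1 - 5 f2] agrees with 2 F[f1] - 5 F[f2] up to rounding.
combo = F(lambda x: 2 * f1(x) - 5 * f2(x))
image_1, image_2 = F(f1), F(f2)
gap = sup_norm(lambda y: combo(y) - (2 * image_1(y) - 5 * image_2(y)))

# Boundedness: ||F[f1]|| <= N ||f1|| with N = max over y of the integral of K(y, x) dx.
N = sup_norm(F(lambda x: 1.0))
print(gap, sup_norm(image_1), N)
```

The printed gap is at the level of rounding error, and the norm of the image of f1 stays below N, as a bound of the form (77) with the norm on the left requires.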


17. Functions of an interval. In future generalized concepts of the
integral it will be more convenient for us to utilize functions of an
interval instead of functions of a point. Let g(x) be a given
non-decreasing bounded function on the infinite axis (−∞, +∞). We
associate with any interval Δ = (α, β] semi-open from the left a
non-negative number: g(β + 0) − g(α + 0) (the mass contained in this
interval). Hence we obtain a function of a semi-open interval, which
we shall write as G(Δ):

$$G(\Delta) = g(\beta + 0) - g(\alpha + 0). \tag{93}$$

To formulate the properties of this function we must introduce a
new concept. We shall say that a sequence Δ^(1), Δ^(2), ... of semi-open
intervals is vanishing if every interval Δ^(k+1) belongs to the previous
interval Δ^(k), and there is no point common to all the intervals. Let us
explain the structure of a vanishing sequence of intervals. Let Δ^(k) be
(a_k, b_k]. By hypothesis, a_k ≤ a_{k+1}, b_k ≥ b_{k+1} and b_k − a_k → 0 on
indefinite increase of k. The monotonic sequences a_k and b_k have a
common limit c, where a_k ≤ c ≤ b_k for any k. Since there is no common
point (in particular, c) of all the intervals, the point c must be
the open left-hand end of the interval for all sufficiently large k, i.e.
Δ^(k) is (c, b_k] for all sufficiently large k and b_k → c. Now:

$$G(\Delta^{(k)}) = g(b_k + 0) - g(c + 0) \to 0,$$

since g(b_k + 0) → g(c + 0). Definition (93) and the arguments just
adduced lead at once to the following three fundamental properties
of G(Δ):

(1) G(Δ) is non-negative;

(2) it is additive, i.e. if the semi-open interval Δ is split into a finite
number of semi-open intervals Δ_1, Δ_2, ..., Δ_p, no pair of which has
common points, this being usually expressed by the equation

$$\Delta = \Delta_1 + \Delta_2 + \dots + \Delta_p, \tag{94}$$

then

$$G(\Delta) = \sum_{k=1}^{p} G(\Delta_k); \tag{95}$$

(3) the function G(Δ) tends to zero in a vanishing sequence of
intervals.

This last will be described as the property of normality of G(Δ).
It has an obvious physical meaning. We shall discuss any intervals:
(α, β], [α, β), [α, β], (α, β), and not merely those semi-open on the left;
also, a separate point α will be regarded as the interval [α, α]. By
starting from a non-decreasing bounded function of a point g(x), we
can construct a function G(Δ) of any type of interval, where the
function possesses the three basic properties given above. All we have to do
is introduce, in addition to (93), the following further definitions:

$$\begin{aligned} G([\alpha, \beta)) &= g(\beta - 0) - g(\alpha - 0); \\ G([\alpha, \beta]) &= g(\beta + 0) - g(\alpha - 0); \\ G((\alpha, \beta)) &= g(\beta - 0) - g(\alpha + 0), \end{aligned} \tag{96}$$

or, if [α] is an interval consisting of a single point, then

$$G([\alpha]) = g(\alpha + 0) - g(\alpha - 0). \tag{97}$$

If the non-decreasing function g(x) is defined only in a finite interval,
say in (a, b], we can extend it to the entire axis by putting g(x) = g(a + 0)
for x ≤ a and g(x) = g(b) for x > b.
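
Definitions (93), (96) and (97) translate directly into code once the one-sided limits of g(x) are available. The following sketch assumes that g(x) is supplied through two callables returning g(x − 0) and g(x + 0); the class name `PointFunction`, its method `G` and the sample function (a unit ramp on [0, 1] plus a jump of 2 at x = 0.5) are hypothetical choices for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PointFunction:
    """A non-decreasing g(x), described by its one-sided limits (an assumed representation)."""

    left: Callable[[float], float]    # x -> g(x - 0)
    right: Callable[[float], float]   # x -> g(x + 0)

    def G(self, alpha, beta, closed_left, closed_right):
        """Interval function of (93), (96), (97).

        Assumes alpha < beta, or alpha == beta with both ends closed (a single point).
        """
        lower = self.left(alpha) if closed_left else self.right(alpha)
        upper = self.right(beta) if closed_right else self.left(beta)
        return upper - lower


def clip(x):
    """Continuous part of the sample g: a ramp from 0 to 1 on [0, 1]."""
    return min(max(x, 0.0), 1.0)


# Sample g(x): the ramp plus a jump of 2 at x = 0.5.
g = PointFunction(
    left=lambda x: clip(x) + (2.0 if x > 0.5 else 0.0),
    right=lambda x: clip(x) + (2.0 if x >= 0.5 else 0.0),
)

# Additivity: (0, 1] = (0, 0.5) + [0.5] + (0.5, 1].
whole = g.G(0.0, 1.0, False, True)
parts = (g.G(0.0, 0.5, False, False)
         + g.G(0.5, 0.5, True, True)
         + g.G(0.5, 1.0, False, True))
print(whole, parts)

# Normality: the intervals (0.5, 0.5 + 1/k] form a vanishing sequence and G tends to 0,
# whereas [0.5, 0.5 + 1/k] all contain the point 0.5 and G tends to the jump, 2.
for k in (10, 100, 1000):
    print(k, g.G(0.5, 0.5 + 1.0 / k, False, True), g.G(0.5, 0.5 + 1.0 / k, True, True))
```

The second loop shows why the definition of a vanishing sequence insists that no point be common to all the intervals: a jump of g(x) carries mass that cannot disappear from closed intervals shrinking onto it.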

We have arrived at the concept of a function of an interval G(Δ)
by starting out from a non-decreasing function of a point g(x).
Conversely, given an interval function with the above three properties,
we can form a point function g(x), which leads to G(Δ) in accordance
with the above scheme. For this, it is sufficient to put

$$g(x) = G((-\infty, x]). \tag{98}$$

The function g(x) thus formed is obviously continuous from the
right. If G(Δ) is defined only for intervals belonging to some interval
Δ_0, it can be defined for all intervals by putting G(Δ) = G(Δ · Δ_0),
where the product of intervals Δ · Δ_0 is defined as the interval
consisting of points that belong simultaneously to Δ and Δ_0. If there are
no such points, i.e. Δ · Δ_0 is the empty set, we naturally put G(Δ · Δ_0) = 0.
Notice that, if a constant is added to g(x), this has no effect on
the value of G(Δ). If g(x) is a function of bounded variation, we make
use of its canonical form as the difference between two non-decreasing
functions: g(x) = g_1(x) − g_2(x). The functions g_1(x) and g_2(x) lead us to
interval functions G_1(Δ) and G_2(Δ) with the three properties
mentioned, whilst the function g(x) gives us the function G(Δ) = G_1(Δ) − G_2(Δ).
This function G(Δ) can be formed directly from g(x) in accordance with
(93), (96) and (97). It is additive and normal, but may be negative.


If we had the closed interval [−∞, +∞], we should have to put, instead of
(98):

$$g(x) = G([-\infty]) + G((-\infty, x]). \tag{99}$$


18. The general Stieltjes integral. We now turn to a generalized type of
Stieltjes integral. As already remarked in [3], we obtain such a generalization
if we require only that the numbers i and I defined in [3] be coincident. Further,
we shall consider integration over an interval Δ of any type, and shall split Δ
into sub-intervals Δ_k of any type, having no common points, either inside or
at the ends. Naturally, an individual point is considered admissible as a
sub-interval.

Suppose, then, that we have bounded functions f(x) and g(x) given on a finite
or infinite interval Δ, where g(x) is non-decreasing. We divide Δ into
sub-intervals Δ_k (k = 1, 2, ..., p) of any type with no common points. Let m_k
and M_k be the strict lower and strict upper bounds of f(x) in Δ_k, and let ξ_k
be any point of Δ_k. Together with g(x), we bring in the interval function G(Δ_k),
defined in [17], and we form the sums:

$$s_\delta = \sum_{k=1}^{p} m_k\,G(\Delta_k); \qquad S_\delta = \sum_{k=1}^{p} M_k\,G(\Delta_k); \qquad \sigma_\delta = \sum_{k=1}^{p} f(\xi_k)\,G(\Delta_k). \tag{100}$$


Some facts must be mentioned in connection with our use of intervals of
any type. If a point P with abscissa x comes in as an independent element
of the subdivision of Δ, the corresponding term in each of sums (100) is the same
and has the form f(x)[g(x + 0) − g(x − 0)]. There is no sense in bringing in
points where g(x) is continuous as independent elements of the subdivision, and
it can be assumed in future that, if a point is an independent element of the
subdivision, it is a point of discontinuity of g(x).

If Δ' and Δ'' are two intervals, their product Δ'Δ'' is defined as the set of points
which belong simultaneously to Δ' and Δ''. This is again an interval, or is the
empty set. Let δ' and δ'' be two subdivisions of Δ. The product of subdivisions
δ' and δ'' is defined as the subdivision δ'δ'' consisting of all possible intervals
Δ'Δ'', where Δ' belongs to δ' and Δ'' belongs to δ''. Pairs of the Δ'Δ'' obviously
have no common points, whilst their sum gives the basic interval Δ. The
subdivision δ' is called an extension of the subdivision δ if every element of δ' lies
wholly in one of the elements of the subdivision δ. The product δ'δ'' is an
extension both of δ' and of δ''. If the interval Δ is closed on the right and x = b
is the right-hand end of Δ, we have to assume g(b + 0) = g(b) in the
definition of G(Δ). Similarly, at the left-hand end g(a − 0) = g(a). This also refers
to the case when b = +∞ or a = −∞. As in [3], we write i for the strict upper
bound of sums s_δ and I for the strict lower bound of sums S_δ for all possible
laws of subdivision. Everything that we said in [3] holds for s_δ, S_δ, σ_δ, i and I.

DEFINITION. We shall say that f(x) is integrable with respect to g(x) (or
G(Δ)) if i = I, and we take i as the value of the integral:

$$i = \int_\Delta f(x)\,dg(x) = \int_\Delta f(x)\,G(d\Delta). \tag{101}$$

The integral thus defined is called the general Stieltjes integral.

The integral defined in [2] will be called simply the Stieltjes integral, or the 
original Stieltjes integral, in order to distinguish it from the general integral. 
We now give the conditions for the existence and the properties of the general 
integral. 

THEOREM 1. The necessary and sufficient condition for the existence of integral
(101) is that there exist a sequence of subdivisions δ_n (n = 1, 2, ...) such that the
difference S_{δ_n} − s_{δ_n} tends to zero, or, what amounts to the same thing, σ_{δ_n} has
a definite limit for any choice of points ξ_k. This limit A in fact gives the value
of the integral. Now, s_{δ_n} → A and S_{δ_n} → A.

This theorem follows directly from [3], the sub-intervals Δ_k having no
common points in the present case. It should be mentioned that the sequence of
subdivisions δ_n mentioned in the theorem need not necessarily be a sequence
with the sub-intervals becoming indefinitely smaller. If, for instance, there
exists a subdivision δ such that σ_δ does not depend on the choice of points ξ_k,
we can take all the δ_n as coinciding with δ.
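
The role of individual points as independent elements of the subdivision can be seen on an example in which f(x) and g(x) are discontinuous at the same point. The sketch below rests on assumptions chosen for illustration only: g(x) = x for x < 0.5 and x + 1 for x ≥ 0.5 (a jump of 1 at 0.5), f(x) = x² for x < 0.5, f(0.5) = 5 and f(x) = x² + 1 for x > 0.5, on the interval [0, 1]. The subdivision consists of n semi-open pieces of [0, 0.5), the single point 0.5, and n pieces of (0.5, 1]; because f(x) is increasing on each side, its lower and upper bounds on each piece are the limits at the ends, and these are hard-coded. The function name `general_stieltjes_sums` is hypothetical.

```python
def general_stieltjes_sums(n):
    """Sums s_delta and S_delta of (100) for the sample f, g and subdivision described above."""
    h = 0.5 / n
    lower = upper = 0.0

    # Pieces [t0, t1) of [0, 0.5): G equals the length h, f(x) = x^2 is increasing.
    for j in range(n):
        t0, t1 = j * h, (j + 1) * h
        lower += t0 ** 2 * h
        upper += t1 ** 2 * h

    # The point 0.5 as an independent element: f(0.5) times the jump of g, in both sums.
    lower += 5.0 * 1.0
    upper += 5.0 * 1.0

    # Pieces (t0, t1] of (0.5, 1]: G equals the length h, f(x) = x^2 + 1 is increasing.
    for j in range(n):
        t0, t1 = 0.5 + j * h, 0.5 + (j + 1) * h
        lower += (t0 ** 2 + 1.0) * h
        upper += (t1 ** 2 + 1.0) * h

    return lower, upper


for n in (10, 100, 1000):
    s, S = general_stieltjes_sums(n)
    # The difference S - s equals 0.5 / n, so both sums close in on 35/6.
    print(n, s, S, S - s)
```

Here S_δ − s_δ tends to zero, so by Theorem 1 the general integral exists and equals 1/24 + 5 + 19/24 = 35/6. If the point 0.5 were not separated out, the term belonging to the sub-interval containing it would depend on whether ξ_k is taken to the left of 0.5, at 0.5 or to the right, and the sums σ_δ would have no definite limit.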

DEFINITION. A sequence of subdivisions δ_n is said to be regular for the function
g(x) (or G(Δ)) if the following two conditions are fulfilled: (1) every point of
discontinuity of g(x) appears as an independent element of subdivision for all δ_n as from
a certain n; (2) the sub-intervals of δ_n become indefinitely smaller on increase of
n, this indefinite refinement having the meaning indicated in [4] in the case of
an infinite interval Δ.

THEOREM 2. If δ_n is a regular sequence of subdivisions, s_{δ_n} → i and S_{δ_n} → I
for any choice of bounded function f(x).




Let δ be any given subdivision of Δ and δ'_n = δδ_n. We shall take n so large
that, firstly, all the points of discontinuity of g(x) which are points of the
subdivision δ have already appeared as independent elements of subdivision δ_n,
and secondly, every sub-interval of δ_n contains not more than one point of the
subdivision δ. It has to be borne in mind in future that the number of such
points is fixed. Let us consider the difference s_{δ'_n} − s_{δ_n}. The terms in the sums
s_{δ_n} and s_{δ'_n} that correspond to the points of δ which are points of discontinuity
of g(x) are the same, since, by what has been said above, they are independent
elements of subdivision in δ_n and hence in δ'_n. In future, we shall only speak
of the points of δ at which g(x) is continuous. Let the number of these be q.
If a sub-interval of subdivision δ_n does not contain such a point of subdivision
in its interior, the terms in s_{δ_n} and s_{δ'_n} corresponding to it are the same. There
remain not more than q sub-intervals of δ_n which contain these points of
δ as interior points. Let x = x' be a point of δ lying inside the sub-interval
Δ^(n) of δ_n. On passing from δ_n to δ'_n, a term of the form λ^(n) G(Δ^(n)) is replaced
by two terms of the form μ_1^(n) G(Δ_1^(n)) + μ_2^(n) G(Δ_2^(n)), where Δ^(n) contains x = x'
as an interior point, and Δ_1^(n) and Δ_2^(n) contain it at the ends (precisely where
x' is reckoned to be is of no importance, since g(x) is continuous at x = x').
The λ^(n), μ_1^(n), μ_2^(n) denote numbers whose absolute values do not exceed the
strict upper bound of | f(x) | in Δ.

Since the sub-intervals become indefinitely finer as n increases in a regular
sequence, this is true for Δ^(n), Δ_1^(n), Δ_2^(n), and since these intervals contain
a fixed point x = x', at which g(x) is continuous, either as an interior point
or as an end, we can assert that

$$\lambda^{(n)} G(\Delta^{(n)}) \to 0 \quad \text{and} \quad \mu_1^{(n)} G(\Delta_1^{(n)}) + \mu_2^{(n)} G(\Delta_2^{(n)}) \to 0 \quad \text{when } n \to \infty.$$

On recalling also that the number of points x' does not exceed q, we can say
that s_{δ'_n} − s_{δ_n} → 0 as n → ∞.

On the other hand, we have the inequalities s_{δ_n} ≤ s_{δ'_n} ≤ i and s_δ ≤ s_{δ'_n} ≤ i.
We can choose the subdivision δ such that s_δ is as close as desired to i, i.e.
given any positive ε, we can choose a δ such that i − s_δ < ε. It now follows at
once from s_δ ≤ s_{δ'_n} ≤ i that i − s_{δ'_n} < ε, and finally, the result obtained above,
s_{δ'_n} − s_{δ_n} → 0, shows us that, for all sufficiently large n, we have i − s_{δ_n} < 2ε,
i.e. since ε is arbitrary, s_{δ_n} → i, which is what we wanted to prove. It can be
shown in precisely the same way that S_{δ_n} → I. Notice that, if g(x) is
continuous, the only characteristic feature of a regular sequence is that its
sub-intervals become indefinitely finer. This is the case, for instance, with the
Riemann integral, where g(x) = x. Since x is unbounded in an infinite interval,
when forming the proper Riemann integral we were forced to consider only
bounded intervals.

THEOREM 3. If integral (101) exists and is equal to A, σ_{δ_n} → A for any regular
sequence of subdivisions.

It follows at once from the hypothesis that i = I = A.
With this, s_{δ_n} → A and S_{δ_n} → A by virtue of Theorem 2, so that σ_{δ_n}, which
satisfies s_{δ_n} ≤ σ_{δ_n} ≤ S_{δ_n}, all the more tends to A for any choice of points ξ_k.

The following corollary is a direct consequence of the above theorems.
If the limit σ_{δ_n} → A exists for some regular sequence δ_n, it likewise exists for any
other regular sequence and is also equal to A. The integral now exists, and is equal
to A. If σ_{δ_n} has no definite limit for some regular sequence of δ_n, integral (101)
does not exist. Hence, the use of sums σ_δ with regular sequences of subdivisions
directly solves the problem of the existence of the general Stieltjes integral.

A further remark: if δ_n is a regular sequence for g(x) and δ'_n is an extension
of δ_n, then δ'_n is also a regular sequence. This follows at once from the definition
of regular sequence.


19. Properties of the (general) Stieltjes integral. As we have seen, the general
Stieltjes integral can be obtained as the limit of sums σ_δ for some choice of
sequence of subdivisions. Now, if the σ_{δ_n} have a definite limit, and δ'_n is an extension
of δ_n, the σ_{δ'_n} also have the same limit. We can therefore prove the properties
of the general Stieltjes integral by using the sums σ_{δ_n}, just as we did for the
Riemann integral and the original Stieltjes integral. We shall give these
properties with some corollaries. Throughout what follows, the functions f(x)
and g(x) are assumed bounded in the interval of integration, and in addition,
g(x) is assumed non-decreasing.


I. If c_k are constants, we have

$$\int_\Delta \sum_{k=1}^{p} c_k f_k(x)\,dg(x) = \sum_{k=1}^{p} c_k \int_\Delta f_k(x)\,dg(x), \tag{102}$$

where the existence of the integrals on the right implies the existence
of the integral on the left.

This is proved simply by taking any regular sequence of δ_n for the function
g(x).


II. If $g_k(x)$ (k = 1, 2, ..., p) are non-decreasing bounded functions
and $c_k$ are positive constants, we have

$$\int_\Delta f(x)\, d\Big(\sum_{k=1}^{p} c_k g_k(x)\Big) = \sum_{k=1}^{p} c_k \int_\Delta f(x)\, dg_k(x), \tag{103}$$

where the existence of the integrals on the right implies the existence
of the integral on the left, and vice versa.


Let $\delta_n^{(k)}$ be a regular sequence of subdivisions for $g_k(x)$.
The sequence $\delta_n = \delta_n^{(1)} \delta_n^{(2)} \cdots \delta_n^{(p)}$ will be a regular sequence for all the
$g_k(x)$ (k = 1, 2, ..., p). We have the obvious equation

$$S_{\delta_n} - s_{\delta_n} = \sum_{k=1}^{p} c_k \big(S_{\delta_n}^{(k)} - s_{\delta_n}^{(k)}\big), \tag{104}$$

where $s_{\delta_n}$ and $S_{\delta_n}$ refer to the integral on the left-hand side of (103), whilst
$s_{\delta_n}^{(k)}$ and $S_{\delta_n}^{(k)}$ refer to the integrals on the right-hand side. If the integrals
on the right-hand side of (103) exist, $S_{\delta_n}^{(k)} - s_{\delta_n}^{(k)} \to 0$ (k = 1, 2, ..., p),
and consequently $S_{\delta_n} - s_{\delta_n} \to 0$, i.e. the integral on the left-hand side exists.
Now suppose that this latter integral exists. There must now exist a sequence
of subdivisions $\delta_n$ such that $S_{\delta_n} - s_{\delta_n} \to 0$. The terms on the right-hand side
of (104) are non-negative, so that $S_{\delta_n}^{(k)} - s_{\delta_n}^{(k)} \to 0$ for k = 1, 2, ..., p,
i.e. the integrals on the right-hand side of (103) exist. Formula (103) itself
follows at once from the analogous property for the finite sums $\sigma_{\delta_n}$ and a passage
to the limit.


III. If the interval $\Delta$ is divided into a finite number of intervals
$\Delta_1, \Delta_2, \ldots, \Delta_m$, with no common points, then

$$\int_\Delta f(x)\, dg(x) = \sum_{k=1}^{m} \int_{\Delta_k} f(x)\, dg(x), \tag{105}$$

where the existence of the integrals on the right implies the existence
of the integral on the left, and vice versa.


Suppose that the integral on the left exists. By Theorem 1, a sequence $\delta_n$
of subdivisions of $\Delta$ exists such that $S_{\delta_n} - s_{\delta_n} \to 0$. Let $\delta$ denote the subdivision
of $\Delta$ into $\Delta_k$ (k = 1, 2, ..., m) and let $\delta'_n = \delta_n \delta$. We obviously have
$S_{\delta'_n} - s_{\delta'_n} \to 0$, since $s_{\delta'_n} \ge s_{\delta_n}$ and $S_{\delta'_n} \le S_{\delta_n}$. The sum of the form (16) that represents
$S_{\delta'_n} - s_{\delta'_n}$ can be split into m non-negative terms, each of which represents
the corresponding sum for some $\Delta_k$, and, since the whole sum tends to zero, we can
say that this is true for the individual sums, i.e. the integrals on the right of
(105) exist. Conversely, let the integrals on the right exist. For each of them,
there exists a sequence of subdivisions $\delta_n^{(k)}$ with which the difference
$S_{\delta_n^{(k)}} - s_{\delta_n^{(k)}}$ tends to zero. The product of these sequences of subdivisions leads at
once to a sequence of subdivisions of the entire interval $\Delta$ with which the
corresponding difference for the integral on the left also tends to zero. Notice
that, for the original Stieltjes integral, in the third of formulae (3), the existence
of the integrals on the right does not imply the existence of the integral on the
left.


IV. If we have $|f(x)| \le L$ in the interval $\Delta$, then

$$\Big| \int_\Delta f(x)\, dg(x) \Big| \le L\, G(\Delta), \tag{106}$$

where $G(\Delta)$ is the total increment of the function g(x) in the interval $\Delta$,
on the assumption that the integral exists.

V. If the functions $f_p(x)$ tend uniformly to f(x) in the interval $\Delta$ as
$p \to \infty$, and the integrals of $f_p(x)$ with respect to g(x) exist, the integral
of f(x) with respect to g(x) also exists, and we have

$$\lim_{p \to \infty} \int_\Delta f_p(x)\, dg(x) = \int_\Delta f(x)\, dg(x). \tag{107}$$

Let $\Delta_k^{(n)}$ ($k = 1, 2, \ldots, t_n$) be the intervals of subdivision of some regular
sequence of subdivisions $\delta_n$ for the function g(x). We consider the sums $\sigma_{\delta_n}^{(p)}$
and $\sigma_{\delta_n}$ for the functions $f_p(x)$ and for f(x), which is evidently bounded in view
of the uniform convergence of the $f_p(x)$:

$$\sigma_{\delta_n}^{(p)} = \sum_{k=1}^{t_n} f_p(\xi_k^{(n)})\, G(\Delta_k^{(n)}); \qquad \sigma_{\delta_n} = \sum_{k=1}^{t_n} f(\xi_k^{(n)})\, G(\Delta_k^{(n)}). \tag{108}$$

The points $\xi_k^{(n)}$ are taken to be the same in all the sums. We form the difference

$$\sigma_{\delta_n} - \sigma_{\delta_n}^{(p)} = \sum_{k=1}^{t_n} \big[ f(\xi_k^{(n)}) - f_p(\xi_k^{(n)}) \big]\, G(\Delta_k^{(n)}).$$

Given any positive $\varepsilon$, in view of the uniform convergence of the $f_p(x)$ there
exists an N such that $|f(x) - f_p(x)| \le \varepsilon$ for p > N and for any x of $\Delta$. Given
p > N, and any choice of $\xi_k^{(n)}$, we have the following inequality:
$|\sigma_{\delta_n} - \sigma_{\delta_n}^{(p)}| \le \varepsilon G(\Delta)$. It is clear from this that $\sigma_{\delta_n}^{(p)}$ tends to $\sigma_{\delta_n}$ as $p \to \infty$, the convergence
being uniform with respect to n and the choice of points $\xi_k^{(n)}$. By hypothesis,
the $f_p(x)$ are integrable with respect to g(x), so that each of the sums $\sigma_{\delta_n}^{(p)}$ has
a limit as n increases indefinitely: let the limit be $A_p$. This limit is in fact the
integral of $f_p(x)$ with respect to g(x). We now show that the sequence of numbers
$A_p$ has a limit.

We have by (106):

$$|A_p - A_q| \le G(\Delta)\, \sup_{x \in \Delta} |f_p(x) - f_q(x)|.$$

By hypothesis, the right-hand side tends to zero as $p, q \to \infty$, so that $A_p$
has a limit as $p \to \infty$, which we write as A. It remains for us to show that
$\sigma_{\delta_n} \to A$ as $n \to \infty$. Let us consider the difference $A - \sigma_{\delta_n}$, which we can write as

$$A - \sigma_{\delta_n} = (A - A_p) + (A_p - \sigma_{\delta_n}^{(p)}) + (\sigma_{\delta_n}^{(p)} - \sigma_{\delta_n}). \tag{109}$$

Given an $\varepsilon > 0$, we first choose p so large that $|A - A_p| \le \varepsilon$ and
$|\sigma_{\delta_n}^{(p)} - \sigma_{\delta_n}| \le \varepsilon$ for any n and $\xi_k^{(n)}$.

Further, for all sufficiently large n, we have $|A_p - \sigma_{\delta_n}^{(p)}| \le \varepsilon$ for the p
fixed above. Therefore $|A - \sigma_{\delta_n}| \le 3\varepsilon$, whence it follows, since $\varepsilon$ is arbitrary,
that $\sigma_{\delta_n} \to A$.
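
The key estimate in the proof of property V, namely that the sums for $f_p$ and for f differ by at most $\varepsilon G(\Delta)$ uniformly in the subdivision, can be checked numerically. The following is only a sketch under assumptions that are not in the text: the integrator g(x) = x², the interval [0, 1], the functions $f_p(x) = x + \sin(px)/p$, equal sub-intervals and midpoint choices of $\xi_k$ are all illustrative.

```python
import math

def g(x):
    """Hypothetical integrator: g(x) = x**2, non-decreasing on [0, 1]."""
    return x * x

def stieltjes_sum(f, n):
    """Sum of f(xi_k) * G(Delta_k) over n equal sub-intervals of [0, 1],
    with xi_k taken at the midpoint of each sub-interval."""
    total = 0.0
    for k in range(n):
        left, right = k / n, (k + 1) / n
        xi = 0.5 * (left + right)
        total += f(xi) * (g(right) - g(left))
    return total

def f(x):
    return x

def make_f_p(p):
    """f_p(x) = x + sin(p x) / p, so |f_p(x) - f(x)| <= 1/p for every x."""
    return lambda x: x + math.sin(p * x) / p

TOTAL_INCREMENT = g(1.0) - g(0.0)  # G(Delta) for Delta = [0, 1]

for p in (1, 10, 100):
    gap = abs(stieltjes_sum(make_f_p(p), 200) - stieltjes_sum(f, 200))
    # The gap never exceeds (1/p) * G(Delta), whatever the subdivision.
    print(p, gap, TOTAL_INCREMENT / p)
```

The bound holds for every n, which is the uniformity with respect to the subdivision that the proof relies on.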


VI. If the integral of f(x) with respect to g(x) exists, the integral of
$|f(x)|$ with respect to g(x) also exists, and we have

$$\Big| \int_\Delta f(x)\, dg(x) \Big| \le \int_\Delta |f(x)|\, dg(x). \tag{110}$$

We bring in the usual notation $m_k$ and $M_k$ for f(x). If both numbers are positive,
we have the same strict bounds for $|f(x)|$. If both are negative, the strict
lower bound and strict upper bound for $|f(x)|$ will be $|M_k|$ and $|m_k|$, and
the difference between the strict upper and strict lower bounds remains as before.
Finally, if $m_k$ is negative, and $M_k$ positive, the strict upper bound of $|f(x)|$
will be the greater of the numbers $|m_k|$ and $M_k$, whilst the strict lower bound
will be a number $\ge 0$. Thus the difference between the strict upper and strict
lower bounds of $|f(x)|$ will never be greater than this difference for f(x). Hence,
if the difference $S_{\delta_n} - s_{\delta_n}$ tends to zero for f(x) for some sequence of subdivisions,
it will all the more tend to zero for $|f(x)|$ for the same sequence of subdivisions,
i.e. the existence of the integral of f(x) implies the existence of the integral of
$|f(x)|$. Inequality (110) is obtained at once from the corresponding inequality
for the sums by passage to the limit.


VII. If the functions $f_1(x)$ and $f_2(x)$ are integrable with respect to
g(x), their product $f_1(x) f_2(x)$ is integrable with respect to g(x).


We first show that, if f(x) is integrable with respect to g(x), then $f^2(x)$ is integrable
with respect to g(x). We shall assume for the moment that f(x) is positive,
and form the differences between the upper and lower sums for f(x) and for $f^2(x)$:

$$\sum_{k=1}^{n} (M_k - m_k)\, G(\Delta_k),$$

$$\sum_{k=1}^{n} (M_k^2 - m_k^2)\, G(\Delta_k) = \sum_{k=1}^{n} (M_k + m_k)(M_k - m_k)\, G(\Delta_k).$$

If the first of these tends to zero for some sequence of subdivisions, the second
will also tend to zero for the same sequence, since the factor $M_k + m_k$ is bounded.
Therefore, given a positive function f(x), the integrability of f(x) implies the
integrability of $f^2(x)$. If f(x) is not positive, there exists, in view of its boundedness,
a positive constant a such that f(x) + a is positive. This latter function is
clearly integrable by property I, so that, by what has been proved,
$[f(x) + a]^2 = f^2(x) + 2a f(x) + a^2$ is also integrable, whence it follows at once that $f^2(x)$
is integrable, this being expressible as a sum of integrable functions:
$f^2(x) = [f(x) + a]^2 - 2a f(x) - a^2$. Finally, to prove that $f_1(x) f_2(x)$ is integrable,
we only need to write it as

$$f_1(x) f_2(x) = \tfrac{1}{2} [f_1(x) + f_2(x)]^2 - \tfrac{1}{2} f_1^2(x) - \tfrac{1}{2} f_2^2(x).$$

The right-hand side is a sum of integrable functions.


20. The existence of the general Stieltjes integral. Some sufficient conditions 
for the existence of the general Stieltjes integral are given below. 

THEOREM. Any bounded function f(x) is integrable with respect to the jump 
function g(x) in the sense of the general Stieltjes integral. 

Suppose first that the set $c_1, c_2, \ldots, c_p$ of points of discontinuity of g(x) is
finite, and let $\delta$ be a subdivision of the basic interval, defined as follows: the
ends of the interval, if they belong to the domain of integration, and the points
$c_1, c_2, \ldots, c_p$ are independent elements of $\delta$, whilst the remaining elements are
the open intervals which are obtained after extracting the above-mentioned
points. In each of these intervals $g_d(x)$ retains a constant value, and the
sums corresponding to the integral of f(x) with respect to $g_d(x)$ are obviously
the same and are given by

$$s_\delta = S_\delta = \sum_{k=1}^{p} f(c_k)\, \gamma_k. \tag{111}$$


The integral of f(x) with respect to $g_d(x)$ therefore exists and is given by (111).
Now let the set of points of discontinuity $c_k$ of g(x) be infinite. Let $\delta_n$ be a subdivision
of the basic interval carried out in the same way as above, the ends
of the interval and the first n points of discontinuity $c_1, c_2, \ldots, c_n$ being taken
as independent elements of the subdivision. We form the sums $\sigma_{\delta_n}$ for this
sequence of $\delta_n$. The sum of the terms of $\sigma_{\delta_n}$ that come from the independent elements
of the subdivision is equal to

$$\sum_{k=1}^{n} f(c_k)\, \gamma_k$$

and does not depend on the variable points $\xi_k$. Let us consider one of the open
intervals $(\alpha, \beta)$ appearing in the subdivision $\delta_n$. The term of the sum $\sigma_{\delta_n}$ corresponding
to it will have the form

$$f(\xi)\, [g(\beta - 0) - g(\alpha + 0)] \qquad (\alpha < \xi < \beta).$$

If $|f(x)| \le L$, we have

$$|f(\xi)|\, [g(\beta - 0) - g(\alpha + 0)] \le L\, [g(\beta - 0) - g(\alpha + 0)],$$

the difference $g(\beta - 0) - g(\alpha + 0)$ being the sum of the jumps of g(x) inside
the interval $(\alpha, \beta)$. The sum of the terms in $\sigma_{\delta_n}$ corresponding to open intervals
of the subdivision $\delta_n$ will therefore have an absolute value not greater than the
product of L with the sum of all the jumps of g(x), except the jumps at the
points $c_1, c_2, \ldots, c_n$. This sum tends to zero on indefinite increase of n, whilst
the sum corresponding to the $c_k$ gives in the limit the sum of the convergent
series

$$\sum_{k=1}^{\infty} f(c_k)\, \gamma_k, \tag{112}$$

i.e. the integral of f(x) with respect to g(x) exists and is given by series (112),
which is what we had to prove.
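
The tail estimate used in this proof can be illustrated numerically. The sketch below rests on assumptions that are purely illustrative and do not come from the text: the points of discontinuity are taken as $c_k = 1/k$, the jumps as $\gamma_k = 2^{-k}$, and f(x) = x, so that L = 1. For this particular choice series (112) is $\sum 2^{-k}/k = \ln 2$.

```python
import math

def jump_point(k):
    """Hypothetical point of discontinuity c_k = 1/k in (0, 1]."""
    return 1.0 / k

def jump_size(k):
    """Hypothetical jump gamma_k = 2**(-k); the jumps sum to 1."""
    return 2.0 ** (-k)

def f(x):
    return x  # bounded on [0, 1] with L = 1

def partial_sum(n):
    """Contribution of the first n points of discontinuity c_1, ..., c_n."""
    return sum(f(jump_point(k)) * jump_size(k) for k in range(1, n + 1))

def remaining_jumps(n):
    """Sum of all jumps except those at c_1, ..., c_n (a geometric tail)."""
    return 2.0 ** (-n)

for n in (5, 10, 20):
    # The error is bounded by L times the sum of the remaining jumps.
    print(n, abs(partial_sum(n) - math.log(2)), remaining_jumps(n))
```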

Thus the question of the existence of the general Stieltjes integral reduces
to the question of the existence of the integral of f(x) with respect to the continuous
non-decreasing function $g_c(x)$. For this latter function, any sequence
with indefinitely diminishing sub-intervals will be a regular sequence. If f(x)
is monotonic in the interval $\Delta$, by the formula for integration by parts [2],
it is integrable over $\Delta$ with respect to $g_c(x)$. Therefore, any monotonic function
is integrable with respect to any increasing bounded function.

Let us indicate a further class of functions integrable with respect to any
increasing bounded function g(x). We shall say that f(x) is piecewise constant
in $\Delta$ if this interval can be divided into a finite number of intervals $\Delta^{(1)}, \Delta^{(2)},
\ldots, \Delta^{(m)}$, pairs of which have no common points, such that f(x) is constant
in each $\Delta^{(k)}$ (k = 1, 2, ..., m). We denote the constants by $b_k$. It is
easily seen that a piecewise constant function is integrable with respect to any
increasing function g(x). For, if $\delta$ is a subdivision of $\Delta$ into $\Delta^{(k)}$ (k = 1, 2, ..., m),
the sums $s_\delta$ and $S_\delta$ are the same:

$$s_\delta = S_\delta = \sum_{k=1}^{m} b_k\, G(\Delta^{(k)}). \tag{113}$$


On repeating this subdivision $\delta$, we can see that the integral of the piecewise
constant function f(x) with respect to g(x) exists and is given by the sum
(113). On applying property V of [19], we see that, if the function f(x) is the
limit of a uniformly convergent sequence of piecewise constant functions in
$\Delta$, f(x) is integrable with respect to any increasing bounded function g(x).

It is worth remarking further that the necessary and sufficient condition for
integrability of f(x) with respect to $g_c(x)$ is the same as was indicated in [10].
We shall not dwell on the proof.
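
Sum (113) is easy to evaluate once the increments $G(\Delta^{(k)})$ are known. The sketch below is illustrative only: the integrator (g(x) = x plus a unit jump at x = 0.5), the pieces and the constants $b_k$ are assumptions, and the increment of an interval is computed on the assumption that a closed end captures the jump located there, while an open end does not.

```python
def g_left(x):
    """g(x - 0) for the hypothetical g(x) = x + (unit jump at x = 0.5)."""
    return x + (1.0 if x > 0.5 else 0.0)

def g_right(x):
    """g(x + 0) for the same g."""
    return x + (1.0 if x >= 0.5 else 0.0)

def increment(a, b, closed_left, closed_right):
    """G(Delta) for an interval with ends a <= b of the given types."""
    upper = g_right(b) if closed_right else g_left(b)
    lower = g_left(a) if closed_left else g_right(a)
    return upper - lower

# Each piece is (a, b, closed_left, closed_right, b_k): f equals b_k on it.
PIECES = [
    (0.0, 0.5, True, False, 2.0),   # [0, 0.5): the jump is not included
    (0.5, 1.0, True, True, -1.0),   # [0.5, 1]: the jump at 0.5 is included
]

def integral_piecewise_constant(pieces):
    """Sum (113): the constants b_k weighted by the increments of g."""
    return sum(b_k * increment(a, b, cl, cr) for a, b, cl, cr, b_k in pieces)

print(integral_piecewise_constant(PIECES))
```

Moving the point x = 0.5 from the second piece to the first changes the value of the sum, since the whole jump then carries the weight $b_1$ instead of $b_2$.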

Note 1. In [8] above, we defined a function of bounded variation g(x)
in a closed interval [a, b]. This concept can be introduced similarly in intervals
of a different type. Let us take say the open interval (a, b). We say that g(x)
is a function of bounded variation in (a, b) if it is of bounded variation in any
closed interval [c, x] lying inside (a, b), and $V_c^x(g)$ is not greater than a certain
number for any such interval. As c diminishes, the value of $V_c^x(g)$ does not diminish,
so that it has a finite limit as c tends to a, which we denote by $V_{a+0}^x(g)$.
As x tends to b, it again has a finite limit, which we call the total variation of
g(x) in (a, b). Formulae (44) define the functions $g_1(x)$ and $g_2(x)$, which appear
in the canonical form of g(x) as the difference between two non-decreasing bounded
functions: $g(x) = g_1(x) - g_2(x)$. The general Stieltjes integral is then defined as
the difference between the integrals with respect to the non-decreasing functions
$g_1(x)$ and $g_2(x)$:

$$\int_\Delta f(x)\, dg(x) = \int_\Delta f(x)\, dg_1(x) - \int_\Delta f(x)\, dg_2(x),$$

and the existence of the integral of f(x) with respect to g(x) follows from the
existence of the integrals of f(x) with respect to $g_1(x)$ and $g_2(x)$.

On taking into account what has been said above, we can assert that, in 
the sense of the general Stieltjes integral, a function of bounded variation and 
a function which is the limit of a uniformly convergent sequence of piecewise 
constant functions are integrable with respect to any function of bounded variation. 

Note 2. A further concept of integral is sometimes introduced when
integrating over a semi-open interval (a, b]; this differs from the general Stieltjes
integral only in dividing the basic interval (a, b] into semi-open intervals of the
type (c, d] with no common points, instead of dividing it into intervals of any
type. Since the sets of numbers $s_\delta$ and $S_\delta$ can now only be part of the sets of
the same numbers for the general integral, the number i can diminish, whilst
the number I can increase. This implies that, if we had i = I for the general
integral, we can have i < I for the new form of integral, i.e. a function integrable
in the sense of the general integral may be non-integrable with the new definition
of the integral.


21. Functions of a two-dimensional interval. The concept of additive
function of an interval and the constructions of the Stieltjes integral 
become more complicated on a plane, in three-dimensional space, and 
in general in n-dimensional space. Let us take the case of a plane. 
The arguments that follow readily suggest the modifications needed 
for spaces with a larger number of dimensions. Given a plane with X
and Y axes, suppose we have an interval $\Delta_x$ on the X axis and an interval
$\Delta_y$ on the Y axis, the intervals being understood in the general
sense as discussed in [20]. These intervals $\Delta_x$ and $\Delta_y$ define an interval
$\Delta$ on the plane, i.e. a point (x, y) is reckoned to belong to $\Delta$ if x belongs
to $\Delta_x$ and y to $\Delta_y$. There can be many varied types of plane interval
(domain). For instance, an interval may be closed, and given by the
inequalities $a \le x \le b$, $c \le y \le d$; or it may be semi-open, given by
$a \le x < b$, $c \le y < d$; or semi-open, given by $a < x \le b$, $c < y \le d$;
or semi-open, given by $a \le x < b$, $c < y \le d$; or it may be a segment
of straight line parallel to the X axis: $a \le x \le b$, y = c, or
the point x = a, y = c, and so on. The numbers a, b, c, d appearing
in the above inequalities may be finite or infinite. Semi-open intervals
given by inequalities of the form $a < x \le b$, $c < y \le d$ will be of
great importance to us later, and for the sake of brevity only intervals
of this type will be described in future as semi-open.

Suppose that a non-negative function $G(\Delta)$, having additive and
normal properties, is defined for any interval $\Delta$ belonging to the
basic interval $\Delta_0$. In other words, if $\Delta$ is the sum of intervals
$\Delta^{(1)} + \Delta^{(2)} + \ldots + \Delta^{(m)}$, with no common points, then

$$G(\Delta) = \sum_{k=1}^{m} G(\Delta^{(k)}),$$

and if $\Delta_1, \Delta_2, \ldots$ is a vanishing sequence of intervals, $G(\Delta_n) \to 0$.

We shall describe some properties of such functions. The fact that
$G(\Delta)$ is non-negative and additive implies at once that $G(\Delta_0)$ is the
greatest value of $G(\Delta)$ for $\Delta$ belonging to $\Delta_0$. We shall simply write
G(P) for the value of $G(\Delta)$ when $\Delta$ is an individual point P; we have
$G(P) \ge 0$ since $G(\Delta)$ is non-negative. It may happen that G(P) = 0 at
every point P belonging to $\Delta_0$. In this case $G(\Delta)$ is said to be continuous
in $\Delta_0$. If G(P) > 0, P is called a point of discontinuity of $G(\Delta)$. It may
easily be shown that the set of points of discontinuity is finite or
denumerable [cf. 6]. Let us consider the points of discontinuity at which
$G(P) \ge 1$. Since $G(\Delta)$ is non-negative and additive, the number of
such points is not greater than the integral part of the number
$G(\Delta_0)$. Similarly, the number of points P at which $G(P) \ge 1/2$ is not
greater than the integral part of $2G(\Delta_0)$, and so on. Hence we can
conclude, precisely as in [6], that the set of points of discontinuity is
finite or denumerable. If it is denumerable, and $P_1, P_2, \ldots$ is a sequence
of points of discontinuity, the series formed from the positive
numbers $G(P_k)$ is convergent and the inequality holds:

$$\sum_{k=1}^{\infty} G(P_k) \le G(\Delta_0). \tag{114}$$

Similarly, if l is the piece of straight line parallel to one of the axes
that appears in the constitution of $\Delta_0$, it is called a line of discontinuity
if G(l) > 0. Every straight line parallel to an axis and passing through
a point of discontinuity gives a line of discontinuity. But there can
also be lines of discontinuity that contain no point of discontinuity.
It can be shown precisely as above that, if lines of discontinuity
exist, the set of them is either finite or denumerable. Here, we take
the complete segment of straight line parallel to an axis that appears
in the composition of $\Delta_0$, without cutting it into pieces. Suppose we
have a sequence of intervals $\Delta_n$ (n = 1, 2, ...) such that $\Delta_n$ contains
$\Delta_{n+1}$, and the point P or all the points of the line l are the only points
common to all the $\Delta_n$. We say in this case that $\Delta_n$ is a system of embedded
intervals tending to P or l (l is a straight segment parallel to one
of the axes). By making use of the normality of $G(\Delta)$, it can be shown
that, if $\Delta_n$ is a system of embedded intervals tending to P or l, then
$G(\Delta_n) \to G(P)$ or G(l). We give the proof for the case of a point P, on
the assumption that P lies inside all the $\Delta_n$, which are open. We draw
through P straight lines parallel to the axes, and split each of the $\Delta_n$
into the following pieces: the point P, the four segments of the straight
lines drawn that lie in $\Delta_n$, and the remaining four intervals. On indefinite
compression of the $\Delta_n$ to the point P, all the constructed elements of
subdivision, apart from P, will form a vanishing sequence of intervals,
and $G(\Delta)$ will tend to zero for each of them, by virtue of the normality
of $G(\Delta)$. On recalling that $G(\Delta)$ is additive, it is now easily seen that
$G(\Delta_n) \to G(P)$. The other cases for a point P, and the cases for a line l†,
can be similarly treated.

The definitions given above become perfectly clear if $G(\Delta)$ is interpreted
as the mass located on the interval $\Delta$ when matter is distributed
over the basic interval $\Delta_0$. If say G(P) > 0, we have a concentrated
mass G(P) at the point P. Similarly, if G(l) > 0, the mass G(l) is
distributed in some manner along the line l. For instance, suppose we
have a semi-open interval $\Delta_0$ ($0 < x \le 2$; $0 < y \le 2$), and that mass
is distributed in it with unit linear density along the line $l_0$ (x = 1,
$0 < y \le 1$). In this case $G(\Delta)$ is equal to the length of the part of $l_0$
which is contained in $\Delta$. At any point P of $\Delta_0$, the function G(P) = 0.
If $\Delta_n$ is a system of embedded intervals tending to $l_0$, their areas tend
to zero, whilst $G(\Delta_n) = 1$ for all n.

† Later on, we shall prove this property of $G(\Delta)$ not only for intervals,
but for much more general point sets.

A function $G(\Delta)$, defined for all intervals belonging to $\Delta_0$, may
readily be extended in general to all intervals of the plane, the function
meantime remaining non-negative, additive and normal. For, if $\Delta$ is
any interval, and $\Delta\Delta_0$ is the product of intervals $\Delta$ and $\Delta_0$, i.e. the set
of points belonging simultaneously to $\Delta$ and $\Delta_0$, this product is an
interval belonging to $\Delta_0$, and we obtain our extension of $G(\Delta)$ if we
put $G(\Delta) = G(\Delta\Delta_0)$. This extension of $G(\Delta)$ may easily be interpreted
physically if we regard $G(\Delta)$ as the mass contained on the interval $\Delta$
when the total mass is distributed over the interval $\Delta_0$.

We shall see later that, if a non-negative, additive and normal 
function is given only on semi-open intervals, it can be uniquely 
extended not only to whole intervals, but also to a much wider class 
of point sets on the plane whilst retaining the properties mentioned. 
Its values at points and on lines l can be obtained, as indicated above,
by a passage to the limit, with the aid of a system of embedded inter- 
vals, the limit being independent of the precise system of embedded 
intervals taken. 

A function $G(\Delta)$ can easily be split into a jump function and a continuous
part. Let $G(\Delta)$ be non-negative, additive, normal and defined
for all $\Delta$ belonging to $\Delta_0$. We write $P_k$ (k = 1, 2, ...) for its points of
discontinuity. We define the jump function $G_d(\Delta)$ as follows: $G_d(\Delta)$ is
equal to the sum of the values of $G(P_k)$ at the $P_k$ which belong to $\Delta$.
It may be remarked that, if there is a denumerable set of such points,
the series consisting of the $G(P_k)$ is convergent. The difference
$G(\Delta) - G_d(\Delta)$ will be denoted by $G_c(\Delta)$. This latter function has no discontinuities.
Both the functions $G_d(\Delta)$ and $G_c(\Delta)$ are easily seen to be
non-negative, additive and normal. The normality follows at once
from the inequalities: $0 \le G_d(\Delta) \le G(\Delta)$ and $0 \le G_c(\Delta) \le G(\Delta)$. We
write our expansion of $G(\Delta)$ as

$$G(\Delta) = G_d(\Delta) + G_c(\Delta).$$


22. Passage to point functions. We can form an interval function
$G(\Delta)$ by using a point function g(x, y). Suppose we have a point
function g(x, y), which is defined if x belongs to the interval $\Delta_x^{(0)}$
of the X axis, and y to the interval $\Delta_y^{(0)}$ of the Y axis, where $\Delta_x^{(0)}$
and $\Delta_y^{(0)}$ define a basic interval $\Delta_0$ on the plane. Suppose further that
g(x, y) is non-decreasing with respect to each variable for any fixed
value of the other, and that

$$g(x + h, y + k) - g(x + h, y) - g(x, y + k) + g(x, y) \ge 0 \tag{115}$$

for any x and y and any positive h and k. It follows from what has been
said that the limits $g(x, y \pm 0)$ and $g(x \pm 0, y)$ exist, where

(1) $g(x_2, y + 0) \ge g(x_1, y + 0)$; $g(x_2, y - 0) \ge g(x_1, y - 0)$
for $x_2 > x_1$, and

(2) $g(x + 0, y_2) \ge g(x + 0, y_1)$; $g(x - 0, y_2) \ge g(x - 0, y_1)$
for $y_2 > y_1$.


In view of the monotonic property expressed by these formulae,
we can pass to the limit successively with respect to the variables.
We form the limits

$$A = \lim_{k \to +0} \lim_{h \to +0} g(x + h, y + k) \quad \text{and} \quad B = \lim_{h \to +0} \lim_{k \to +0} g(x + h, y + k)$$

and show that they are equal. We have $g(x + h, y + k) \ge g(x + h, y + 0)$
and, on passing to the limit first with respect to h, then with
respect to k, we get $A \ge B$. Similarly, it can be shown that $B \ge A$,
so that A = B. We naturally write the symbol g(x + 0, y + 0) for the
quantity A = B. It can similarly be shown that

$$\lim_{k \to +0} \lim_{h \to +0} g(x - h, y - k) = \lim_{h \to +0} \lim_{k \to +0} g(x - h, y - k),$$

and this limit is naturally written symbolically as g(x - 0, y - 0).
So far, we have only used the fact that g(x, y) is non-decreasing with
respect to each variable. We now make use of condition (115). We form
the limits

$$A_1 = \lim_{k \to +0} \lim_{h \to +0} g(x + h, y - k) \quad \text{and} \quad B_1 = \lim_{h \to +0} \lim_{k \to +0} g(x + h, y - k)$$

and show that $A_1 = B_1$. Putting $0 < h_1 < h$ and $0 < k_1 < k$, we can
write, in view of (115):

$$g(x + h, y - k_1) - g(x + h, y - k) - g(x + h_1, y - k_1) + g(x + h_1, y - k) \ge 0.$$

On letting first $h_1$, then $k_1$ tend to zero, we get

$$g(x + h, y - 0) - g(x + h, y - k) - A_1 + g(x + 0, y - k) \ge 0.$$

If we now let first h, then k tend to zero, we get $B_1 - A_1 \ge 0$, i.e.
$B_1 \ge A_1$. It can similarly be shown that $A_1 \ge B_1$, so that $A_1 = B_1$.


We naturally write g(x + 0, y - 0) for the quantity $A_1 = B_1$.
Similarly,

$$\lim_{k \to +0} \lim_{h \to +0} g(x - h, y + k) = \lim_{h \to +0} \lim_{k \to +0} g(x - h, y + k),$$

and we write g(x - 0, y + 0) for this repeated limit. It may easily be
shown that, in all the four cases considered, the same limit is obtained
for any simultaneous convergence of h and k to (+0). For instance,
we can assert the following: given any positive $\varepsilon$, there exists a positive
$\eta$ such that

$$|g(x + h, y + k) - g(x + 0, y + 0)| \le \varepsilon$$

for $0 < h < \eta$ and $0 < k < \eta$.

The symbols $g(x \pm 0, y \pm 0)$ thus have a definite meaning when
condition (115) is satisfied. A non-negative, additive and normal
function $G(\Delta)$ is formed with the aid of g(x, y) as follows. Let $\Delta_x$ and
$\Delta_y$ be intervals on the axes defining the interval $\Delta$ on the plane, and
let a and b be the boundary points of $\Delta_x$, and c, d the boundary points
of $\Delta_y$. Now $G(\Delta)$ is equal to the following expression:

$$g(b \pm 0, d \pm 0) - g(b \pm 0, c \pm 0) - g(a \pm 0, d \pm 0) + g(a \pm 0, c \pm 0),$$

where the + sign is taken when the interval is closed at the right-hand
end or open at the left-hand end, and the - sign when the interval is
open at the right-hand end or closed at the left-hand end. For instance,
we have in the case of the closed interval $a \le x \le b$, $c \le y \le d$,

$$G(\Delta) = g(b + 0, d + 0) - g(b + 0, c - 0) - g(a - 0, d + 0) + g(a - 0, c - 0).$$

In the case of the semi-open interval $a < x \le b$, $c < y \le d$:

$$G(\Delta) = g(b + 0, d + 0) - g(b + 0, c + 0) - g(a + 0, d + 0) + g(a + 0, c + 0)$$

and in the case of the point x = a, y = c:

$$G(\Delta) = g(a + 0, c + 0) - g(a + 0, c - 0) - g(a - 0, c + 0) + g(a - 0, c - 0).$$

Conversely, if $G(\Delta)$ exists, we can easily form the point function
g(x, y), with the aid of which $G(\Delta)$ can be obtained by the method
indicated. Suppose say that $G(\Delta)$ is given throughout the open plane.


Now, g(x, y) can be taken equal to the value of $G(\Delta)$ for the interval
$\Delta_{xy}$, which is defined by the following intervals on the axes:
$-\infty < x' \le x$; $-\infty < y' \le y$. The g(x, y) thus constructed is continuous
with respect to x and y from the right. It may be observed that the
left-hand side of (115) is unchanged if an arbitrary polynomial of the
first degree in x and y is added to g(x, y). If g(x, y) is a continuous
function, the $G(\Delta)$ corresponding to it has no points or lines of discontinuity,
and vice versa. If g(x, y) has a discontinuity at the point
(x, y), but

$$g(x + 0, y + 0) - g(x - 0, y + 0) - g(x + 0, y - 0) + g(x - 0, y - 0) = 0,$$

this point is not a discontinuity of $G(\Delta)$.

Let us take as an example the above-mentioned mass distribution,
in which mass is distributed with unit linear density on the segment
x = 1, $0 < y \le 1$. Here g(x, y) = 0 if x < 1 or y < 0; g(x, y) = y if
$x \ge 1$ and $0 \le y \le 1$; g(x, y) = 1 if $x \ge 1$ and y > 1.
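
This example can be checked directly. The sketch below is illustrative and makes one assumption beyond the text: it evaluates $G(\Delta)$ only for semi-open intervals $a < x \le b$, $c < y \le d$, using the fact that the g(x, y) of the example is continuous from the right, so that g(t + 0) may be replaced by g(t). The function names are hypothetical.

```python
def g(x, y):
    """Point function of the example: unit linear density on x = 1, 0 < y <= 1."""
    if x < 1 or y < 0:
        return 0.0
    return min(y, 1.0)

def G_semi_open(a, b, c, d):
    """G(Delta) for the semi-open interval a < x <= b, c < y <= d."""
    return g(b, d) - g(b, c) - g(a, d) + g(a, c)

# The whole basic interval carries the total mass 1.
print(G_semi_open(0, 2, 0, 2))
# Only the part 0.25 < y <= 0.75 of the loaded segment lies in this interval.
print(G_semi_open(0.5, 1.5, 0.25, 0.75))
# Embedded intervals shrinking onto the segment: the area tends to 0, G stays 1.
for n in (1, 10, 100):
    print(n, 1.0 / n, G_semi_open(1 - 1.0 / n, 1, 0, 1))
```

An interval such as $1 < x \le 2$, $0 < y \le 2$ excludes the line x = 1 and therefore carries no mass.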

A similar treatment can be given of intervals in three-dimensional 
space with axes X, Y and Z. Such an interval is defined by three intervals
on the axes. In addition to points, and lines parallel to the axes, we 
also have to consider planes parallel to the coordinate planes. Other- 
wise, all the arguments and results are the same as above. 


23. The Stieltjes integral on a plane. The concept of Stieltjes integral
is readily generalized to the case of a plane. Obviously, we are concerned
here with the subdivision of a two-dimensional interval. If $\Delta$ is
an interval of the plane, defined by intervals $\Delta_x$ and $\Delta_y$ on the axes,
a subdivision of $\Delta$ is a division obtained by dividing $\Delta_x$ and $\Delta_y$ into
sub-intervals. Every sub-interval of $\Delta$ is defined by a sub-interval of
$\Delta_x$ and a sub-interval of $\Delta_y$. Figure 1 illustrates a subdivision of a
semi-open interval, carried out as indicated, into six sub-intervals.


Fig. 1.          Fig. 2.


A subdivision of quite a different type is illustrated in Fig. 2; the
dividing lines here end in the middle of $\Delta$, and it cannot be obtained
by the above method of subdividing $\Delta_x$ and $\Delta_y$. But we can easily pass
from a subdivision of the second type to a subdivision of the first type
by continuing all the dividing lines, and we shall in future only use
subdivisions of the first type; this is a matter of minor importance.
A subdivision $\delta'$ is described as an extension of a subdivision $\delta$ if the
subdivisions of $\Delta_x$ and $\Delta_y$ corresponding to $\delta'$ are extensions of the
subdivisions of $\Delta_x$ and $\Delta_y$ corresponding to $\delta$. If $\delta_1$ and $\delta_2$ are two
subdivisions of $\Delta$, and $\delta_x^{(1)}$, $\delta_y^{(1)}$ and $\delta_x^{(2)}$, $\delta_y^{(2)}$ are the corresponding
subdivisions of $\Delta_x$ and $\Delta_y$, the product $\delta_1 \delta_2$ is the subdivision given by the
subdivisions $\delta_x^{(1)} \delta_x^{(2)}$ and $\delta_y^{(1)} \delta_y^{(2)}$ of $\Delta_x$ and $\Delta_y$. The subdivision $\delta_1 \delta_2$
is evidently an extension of $\delta_1$ and $\delta_2$. A further remark: if $\Delta'$ is an
interval belonging to $\Delta$, there exist subdivisions of $\Delta$ in which $\Delta'$ is
one of the sub-intervals.

It is easy to form an analogue of the Stieltjes integral in the 
case of a plane, three-dimensional space, and in general n-dimensional 
space. We shall confine ourselves to the plane case. The constructions 
are essentially the same in the other cases. 

Let $\Delta_0$ be a finite interval on the plane, on which are given a point
function f(P) which is uniformly continuous and therefore bounded,
and a non-negative, additive and normal interval function $G(\Delta)$.
Let $\delta$ be a subdivision of $\Delta_0$ into intervals $\Delta_1, \Delta_2, \ldots, \Delta_n$, pairs of
which have no common points. We choose a point $P_k$ in each $\Delta_k$ and
form the sum

$$\sigma_\delta = \sum_{k=1}^{n} f(P_k)\, G(\Delta_k). \tag{116}$$

As in [4], it can be shown that this sum has a definite limit when
the greatest of the diameters of the domains $\Delta_k$ tends to zero. This
limit is in fact the integral of f(P) with respect to $G(\Delta)$:

$$\iint_{\Delta_0} f(P)\, G(d\Delta) = \lim \sum_{k=1}^{n} f(P_k)\, G(\Delta_k). \tag{117}$$


If f(x, y) is continuous in the closed interval $\Delta_0$ ($a_1 \le x \le b_1$;
$a_2 \le y \le b_2$) and g(x, y) has the above-mentioned properties, the
Stieltjes integral can be defined as the limit of the sum

$$\sigma_\delta = \sum_{k=1}^{p} \sum_{l=1}^{q} f(\xi_k, \eta_l)\, [g(x_k, y_l) - g(x_{k-1}, y_l) - g(x_k, y_{l-1}) + g(x_{k-1}, y_{l-1})], \tag{118}$$

where

$$a_1 = x_0 < x_1 < \ldots < x_{p-1} < x_p = b_1; \quad a_2 = y_0 < y_1 < \ldots < y_{q-1} < y_q = b_2;$$
$$x_{k-1} \le \xi_k \le x_k; \quad y_{l-1} \le \eta_l \le y_l, \tag{119}$$

on indefinite refinement of the sub-intervals.
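
Sum (118) can be written out directly in code. The sketch below is only an illustration under assumed data: the point function g(x, y) = xy on the unit square (which is non-decreasing in each variable there and satisfies (115), since the left-hand side of (115) equals hk), the integrand f(x, y) = x + y, equal sub-intervals and midpoints for $\xi_k$, $\eta_l$. With this g the Stieltjes integral reduces to the ordinary double integral of f, which equals 1.

```python
def g(x, y):
    """Hypothetical point function g(x, y) = x * y on the square [0, 1] x [0, 1]."""
    return x * y

def f(x, y):
    return x + y

def double_stieltjes_sum(p, q):
    """Sum (118) for p equal sub-intervals in x and q in y, midpoints for xi, eta."""
    total = 0.0
    for k in range(1, p + 1):
        x_prev, x_k = (k - 1) / p, k / p
        xi = 0.5 * (x_prev + x_k)
        for j in range(1, q + 1):
            y_prev, y_j = (j - 1) / q, j / q
            eta = 0.5 * (y_prev + y_j)
            # Second difference of g over the sub-interval, as in (118).
            increment = g(x_k, y_j) - g(x_prev, y_j) - g(x_k, y_prev) + g(x_prev, y_prev)
            total += f(xi, eta) * increment
    return total

print(double_stieltjes_sum(20, 30))
```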

As in [4], it can be shown that the Stieltjes integral exists, on the
assumption that f(P) is continuous inside $\Delta_0$ and bounded, if $G(\Delta)$
satisfies a subsidiary condition which we state next.

Let $\Delta^{(n)}$ be closed intervals lying inside $\Delta_0$, which expand and tend
to $\Delta_0$ in such a way that any interior point of $\Delta_0$ comes inside $\Delta^{(n)}$ for
sufficiently large n. We require that $G(\Delta^{(n)}) \to G(\Delta_0)$. This is analogous
to the continuity of g(x) at the ends of the interval, which we spoke
about in [4].

The Stieltjes integral can be defined for the whole of the plane:
$-\infty < x < +\infty$, $-\infty < y < +\infty$, which we write as Q. Let $G(\Delta)$ be
a non-negative, additive and normal function, defined for all intervals
both finite and infinite belonging to Q. Let $\Delta^{(n)}$ be a sequence of intervals
indefinitely expanding in all directions, e.g. let $\Delta^{(n)}$ be $-n < x < n$;
$-n < y < n$. Since $G(\Delta)$ is normal, $G(\Delta^{(n)}) - G(Q) \to 0$ as
$n \to \infty$. If f(P) is continuous and bounded in Q, the Stieltjes integral
(117) exists. Here, the sequence of subdivisions must be such that,
given any fixed n, the greatest diameter of an interval having points
in common with $\Delta^{(n)}$ tends to zero.

The domain of integration may not be the interval $\Delta_0$, but may be
a domain S which represents the sum of a finite number of intervals.
We can perform as many subdivisions as desired, form sums (116),
and pass to the limit. The integral over S reduces to the finite sum of
integrals over the intervals into which S can be split, and obviously
does not depend on the method of subdividing S.

The properties of the double Stieltjes integral are precisely analogous 
to those given above for the simple integral. 


24. Functions of bounded variation on the plane. The treatment of
functions of bounded variation on a plane is in many respects similar
to the above. The statement will be somewhat different since our
discussion will be in terms of interval functions instead of point functions.
Let $G(\Delta)$ be additive and normal and defined for all intervals
(in the usual sense of this word), belonging to some basic interval
$\Delta_0$. We shall not assume that this $G(\Delta)$ is non-negative. Let $\Delta_1, \ldots, \Delta_n$
be a subdivision $\delta$ of the interval $\Delta_0$ into sub-intervals. We form the
sums

$$t_\delta = \sum_{k=1}^{n} |G(\Delta_k)|. \tag{120}$$

DEFINITION. If, given all possible subdivisions $\delta$, the set of values of
$t_\delta$ is bounded, $G(\Delta)$ is said to be a function of bounded variation in the
interval $\Delta_0$, whilst the strict upper bound of these sums $t_\delta$ is called the
total variation or simply the variation of $G(\Delta)$ in the interval $\Delta_0$. We
shall denote it by the symbol $V_{\Delta_0}(G)$. The properties of sums $t_\delta$ and
of the total variation are precisely similar to the properties discussed
in [8], and we shall state most of these properties without proof.

If $\delta'$ is an extension of the subdivision $\delta$, then
$t_{\delta'} \ge t_\delta$. If $G(\Delta)$ is of bounded variation in
$\Delta_0$, it is of bounded variation in any interval $\Delta'$ belonging
to $\Delta_0$, where $V_{\Delta'}(G) \le V_{\Delta_0}(G)$. Any non-negative
or non-positive function $G(\Delta)$ is of bounded variation. If the
interval $\Delta'$ belongs to $\Delta_0$, we have

$$|G(\Delta')| \le V_{\Delta_0}(G), \tag{121}$$

and $G(\Delta)$, of bounded variation in $\Delta_0$, will be bounded (in
absolute value) for all $\Delta$ belonging to $\Delta_0$. Every linear
combination of functions of bounded variation is of bounded variation.
Theorem 3 of [8] holds for a product and quotient. The total variation
$V_\Delta(G)$ is a non-negative function defined in $\Delta_0$. We can show
by repeating the proof of theorem 4 of [8] that $V(\Delta) = V_\Delta(G)$
is additive. Let us show that $V(\Delta)$ is a normal function in
$\Delta_0$. Let $\Delta_m$ (m = 1, 2, 3, ...) be a vanishing sequence of
intervals. We have to show that $V(\Delta_m) \to 0$. Let $\varepsilon$ be a
given positive number. We take a subdivision $\delta$: $\Delta^{(1)},
\Delta^{(2)}, \ldots, \Delta^{(p)}$ of $\Delta_0$ for which

$$\sum_{k=1}^{p} |G(\Delta^{(k)})| > V(\Delta_0) - \varepsilon. \tag{122}$$

For any k of the series of numbers k = 1, 2, 3, ..., p, the product
$\Delta_m \Delta^{(k)}$ (m = 1, 2, ...) is a vanishing sequence. Since
$G(\Delta)$ is normal, we can fix an $m = m_0$ such that

$$|G(\Delta_m \Delta^{(k)})| \le \frac{\varepsilon}{p} \quad\text{for}\quad m > m_0 \quad (k = 1, 2, \ldots, p). \tag{123}$$


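We fix any $m > m_0$. Each interval $\Delta^{(k)}$ is subjected to a further
subdivision such that $\Delta_m \Delta^{(k)}$ is a sub-interval. Let
$\Delta_s^{(k)}$ ($s = 1, 2, \ldots, n_k$) be the remaining sub-intervals
with this subdivision of $\Delta^{(k)}$. A subdivision of $\Delta_0$ is thus
obtained which is an extension of subdivision $\delta$, so that inequality
(122) holds for it all the more, i.e.

$$\sum_{k=1}^{p} |G(\Delta_m \Delta^{(k)})| + \sum_{k=1}^{p}\sum_{s=1}^{n_k} |G(\Delta_s^{(k)})| > V(\Delta_0) - \varepsilon.$$

Since $V(\Delta)$ is additive, and $|G(\Delta_s^{(k)})| \le V(\Delta_s^{(k)})$,
the last inequality gives us

$$\sum_{k=1}^{p} |G(\Delta_m \Delta^{(k)})| + \sum_{k=1}^{p}\sum_{s=1}^{n_k} V(\Delta_s^{(k)}) > V(\Delta_m) + \sum_{k=1}^{p}\sum_{s=1}^{n_k} V(\Delta_s^{(k)}) - \varepsilon$$

or

$$V(\Delta_m) < \sum_{k=1}^{p} |G(\Delta_m \Delta^{(k)})| + \varepsilon \quad\text{for}\quad m > m_0.$$

On taking (123) into account, we get

$$V(\Delta_m) < \sum_{k=1}^{p} \frac{\varepsilon}{p} + \varepsilon = 2\varepsilon \quad\text{for}\quad m > m_0,$$

whence, in view of the arbitrariness of $\varepsilon$, it follows that
$V(\Delta_m) \to 0$.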
Thus $V(\Delta)$ is a non-negative, additive and normal interval
function in $\Delta_0$. We define further non-negative, additive and normal
functions in accordance with the formulae

$$G_1(\Delta) = \tfrac{1}{2}[V(\Delta) + G(\Delta)]; \quad G_2(\Delta) = \tfrac{1}{2}[V(\Delta) - G(\Delta)] \tag{124}$$

and thus obtain the canonical form of a function of bounded variation
$G(\Delta)$ as the difference between two non-negative, additive and normal
functions:

$$G(\Delta) = G_1(\Delta) - G_2(\Delta). \tag{125}$$

Given some analogous form

$$G(\Delta) = G_1^*(\Delta) - G_2^*(\Delta), \tag{126}$$

we have for any $\Delta$ belonging to $\Delta_0$:

$$G_1(\Delta) \le G_1^*(\Delta) \quad\text{and}\quad G_2(\Delta) \le G_2^*(\Delta).$$

Conversely, if $G(\Delta)$ is the difference between two non-negative,
additive and normal functions, $G(\Delta)$ is a function of bounded variation.
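
The canonical form (124), (125) can be illustrated by a small numerical sketch. It assumes a hypothetical interval function generated by three signed point charges in the unit square and works with semi-open intervals; the names CHARGES, variation_sum, G1 and G2 are illustrative only and do not come from the text. For such a G the total variation is simply the sum of the absolute values of the charges.

```python
# Hypothetical data: signed point charges (x, y) -> q inside the unit square.
CHARGES = {(0.25, 0.35): 1.5, (0.65, 0.15): -2.0, (0.75, 0.85): 0.5}

def G(rect):
    """Signed charge in the semi-open interval rect = (x0, x1, y0, y1)."""
    x0, x1, y0, y1 = rect
    return sum(q for (x, y), q in CHARGES.items()
               if x0 <= x < x1 and y0 <= y < y1)

def variation_sum(rect, n):
    """Sum (120) for the uniform n-by-n subdivision of rect."""
    x0, x1, y0, y1 = rect
    xs = [x0 + (x1 - x0) * i / n for i in range(n + 1)]
    ys = [y0 + (y1 - y0) * j / n for j in range(n + 1)]
    return sum(abs(G((xs[i], xs[i + 1], ys[j], ys[j + 1])))
               for i in range(n) for j in range(n))

def V(rect):
    """Total variation on rect; for point charges it is the sum of |q|."""
    x0, x1, y0, y1 = rect
    return sum(abs(q) for (x, y), q in CHARGES.items()
               if x0 <= x < x1 and y0 <= y < y1)

def G1(rect):
    """Non-negative part in the canonical form (124)."""
    return 0.5 * (V(rect) + G(rect))

def G2(rect):
    """Non-positive part (taken with the plus sign) in (124)."""
    return 0.5 * (V(rect) - G(rect))

UNIT = (0.0, 1.0, 0.0, 1.0)
```

With these data G(UNIT) vanishes although V(UNIT) does not: the sum (120) for the trivial subdivision is zero, and it already reaches the total variation once the subdivision separates the charges.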


If the interval $\Delta$ is the point P, V(P) coincides with |G(P)|. If
G(P) = 0, then V(P) = 0, i.e. the continuity of $G(\Delta)$ at a point
implies the continuity of $V(\Delta)$ at that point. The situation is
rather more complicated for segments parallel to the axes. Let us interpret
$G(\Delta)$ as the amount of charge distributed in the interval $\Delta$.
If electrical charges whose total sum is zero are distributed, say with
definite linear density, on some segment l of a straight line parallel to
an axis, G(l) = 0. On the other hand, we have V(l) > 0, since V(l) gives
the total sum of the charges when all the charges are taken with the plus
sign.

An additive and normal function $G(\Delta)$ could be defined only on
semi-open intervals and use made only of subdivisions into semi-open
intervals. All the arguments given above could be repeated in this case,
and (125) obtained. Non-negative, additive and normal functions
$G_1(\Delta)$ and $G_2(\Delta)$ of a semi-open interval could be extended
to all possible intervals, and hence (125) would give us an extension
of the given $G(\Delta)$ to all possible intervals, and this function would
be additive, normal and of bounded variation for all possible intervals.

By using canonical form (125), we can define the integral with
respect to $G(\Delta)$ in the usual way:

$$\int_{\Delta_0} f(P)\,G(d\Delta) = \int_{\Delta_0} f(P)\,G_1(d\Delta) - \int_{\Delta_0} f(P)\,G_2(d\Delta). \tag{127}$$

Given a closed interval $\Delta_0$ ($a_1 \le x \le b_1$; $a_2 \le y \le b_2$),
by using expression (115) we can define a function of bounded variation
g(x, y) and a total variation as in [8], by starting from the sums

$$t_\delta = \sum_{k}\sum_{l} |g(x_k, y_l) - g(x_{k-1}, y_l) - g(x_k, y_{l-1}) + g(x_{k-1}, y_{l-1})|. \tag{128}$$


25. The space of continuous functions of several variables. Let f(x, y) be a
continuous function given in a finite closed interval $\Delta_0$
($a_1 \le x \le b_1$; $a_2 \le y \le b_2$). With the aid of the linear
transformation x' = ax + b, y' = cy + d (a and c $\ne$ 0), we can reduce
$\Delta_0$ to the interval ($0 \le x' \le 1$; $0 \le y' \le 1$), and a
continuous function remains continuous after transformation. We shall
therefore assume that the original interval $\Delta_0$ has already been
transformed to $0 \le x \le 1$, $0 \le y \le 1$. We form the Bernshtein
polynomials [II; 154] for f(x, y):

$$P_{m,n}(x, y) = \sum_{k=0}^{m}\sum_{l=0}^{n} f\!\left(\frac{k}{m}, \frac{l}{n}\right) C_m^k C_n^l\, x^k y^l (1-x)^{m-k} (1-y)^{n-l}. \tag{129}$$

As in [II; 154], we can show by using the uniform continuity of f(x, y) in
$\Delta_0$ that $P_{m,n}(x, y) \to f(x, y)$ on indefinite increase of m and
n, uniformly in $\Delta_0$. Now suppose that we have an f(x, y) which is
continuous on the bounded closed set F [IV; 157]. By using a linear
transformation as above, we can assume that F belongs to the $\Delta_0$
indicated above. Further, we can extend f(x, y) to the whole of $\Delta_0$
whilst preserving its continuity and the maximum of |f(x, y)| [IV; 157].
We can construct for the f(x, y) thus extended a sequence of polynomials
$P_{m,n}(x, y)$ such that $P_{m,n}(x, y) \to f(x, y)$ uniformly in
$\Delta_0$ and all the more uniformly in F.
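
The uniform convergence of the polynomials (129) can be checked numerically. The sketch below is only an illustration: the function names bernstein2 and max_error, the test function and the sampling grid are assumptions made here and are not part of the text.

```python
from math import comb

def bernstein2(f, m, n, x, y):
    """Evaluate the polynomial (129) for f on the unit square at (x, y)."""
    total = 0.0
    for k in range(m + 1):
        bx = comb(m, k) * x**k * (1 - x)**(m - k)
        for l in range(n + 1):
            by = comb(n, l) * y**l * (1 - y)**(n - l)
            total += f(k / m, l / n) * bx * by
    return total

def sample(x, y):
    """A continuous function that is not differentiable along x = 1/2."""
    return abs(x - 0.5) + y * y

def max_error(m, grid=10):
    """Largest deviation |P_{m,m} - f| over a uniform grid of sample points."""
    return max(abs(bernstein2(sample, m, m, i / grid, j / grid)
                   - sample(i / grid, j / grid))
               for i in range(grid + 1) for j in range(grid + 1))
```

The error measured in this way decreases as m grows, although slowly, which is typical of Bernshtein polynomials. Functions linear in each variable, such as xy, are reproduced exactly.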

We now give a treatment similar to that of [14] of the space C of functions
continuous in $\Delta_0$, with the following definition of the norm:
$\|f\| = \max |f(x, y)|$ in $\Delta_0$. As in [15], it can be shown that the
general form of linear functional in C is the Stieltjes integral

$$l(f) = \int_{\Delta_0} f(x, y)\,G(d\Delta), \tag{130}$$

where $G(\Delta)$ is the function of bounded variation in $\Delta_0$
characterizing the functional $l(f)$.

When defining a function of bounded variation, use can be made of sums
(128), whilst integral (130) is defined as the limit of sums (118) on
indefinite subdivision. Information on the space of continuous functions on
bounded closed sets can be found in Radon's article "Linear functional
transformations and functional equations" (O lineinykh funktsional'nykh
preobrazovaniyakh i funktsional'nykh uravneniyakh) (Uspekhi
matematicheskikh nauk, Vyp. 1, 1936) and in the book by F. Riesz and
B. Szőkefalvi-Nagy, Lectures on Functional Analysis.


26. The Fourier-Stieltjes integral. Let us consider a function expressible by
the Fourier-Stieltjes integral

$$\varphi(t) = \int_{-\infty}^{+\infty} e^{itx}\,dg(x) \quad (-\infty < t < +\infty), \tag{131}$$

where g(x) is a non-decreasing bounded function, continuous for
$x = \pm\infty$, i.e.

$$g(-\infty) = \lim_{x \to -\infty} g(x); \quad g(+\infty) = \lim_{x \to +\infty} g(x).$$

Integral (131) obviously exists, since the function $e^{itx}$ is continuous
and bounded [4]. The elementary properties of the function $\varphi(t)$ are
as follows. We have

$$|\varphi(t)| \le \int_{-\infty}^{+\infty} |e^{itx}|\,dg(x) = \int_{-\infty}^{+\infty} dg(x) = g(+\infty) - g(-\infty) = \varphi(0),$$

or $|\varphi(t)| \le \varphi(0)$, i.e. $\varphi(t)$ is bounded. The identity

$$\varphi(-t) = \overline{\varphi(t)} \tag{132}$$

also follows at once from (131).
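Let us show further that $\varphi(t)$ is uniformly continuous in
$(-\infty, +\infty)$. We have for the absolute value of
$\varphi(t + h) - \varphi(t)$:

$$|\varphi(t+h) - \varphi(t)| \le \int_{-\infty}^{+\infty} |e^{i(t+h)x} - e^{itx}|\,dg(x) = 2\int_{-\infty}^{+\infty} \left|\sin\frac{hx}{2}\right| dg(x). \tag{133}$$
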

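We first make n so large that

$$g(-n) - g(-\infty) < \varepsilon; \quad g(+\infty) - g(n) < \varepsilon.$$

We next fix an $\eta$ independent of t such that, in the interval
$-n \le x \le n$:

$$2\left|\sin\frac{hx}{2}\right| < \varepsilon \quad\text{for}\quad |h| < \eta.$$

With this, and since $2|\sin(hx/2)| \le 2$ outside this interval, we obtain
by (133):

$$|\varphi(t+h) - \varphi(t)| < [4 + g(n) - g(-n)]\,\varepsilon \le [4 + g(+\infty) - g(-\infty)]\,\varepsilon,$$

whence follows the uniform continuity of $\varphi(t)$, since $\varepsilon$
is arbitrary and $\eta$ is independent of t. To prove a further property of
$\varphi(t)$, we take any m real numbers $t_1, t_2, \ldots, t_m$ and form
the Hermitian form:

$$\sum_{p,q=1}^{m} \varphi(t_p - t_q)\,\xi_p \bar{\xi}_q \tag{134}$$

in the variables $\xi_p$. On taking (131) into account, we can write

$$\varphi(t_p - t_q)\,\xi_p \bar{\xi}_q = \int_{-\infty}^{+\infty} e^{it_p x}\xi_p\,\overline{e^{it_q x}\xi_q}\;dg(x),$$

and we get the following expression for the Hermitian form (134):

$$\sum_{p,q=1}^{m} \varphi(t_p - t_q)\,\xi_p \bar{\xi}_q = \int_{-\infty}^{+\infty} \Bigl|\sum_{p=1}^{m} e^{it_p x}\xi_p\Bigr|^2 dg(x),$$

whence it follows at once that, given any m and any choice $t_p$ of the
values of t, the Hermitian form (134) is non-negative, i.e.

$$\sum_{p,q=1}^{m} \varphi(t_p - t_q)\,\xi_p \bar{\xi}_q \ge 0. \tag{135}$$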
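
Properties (132) and (135) can be verified numerically in the simplest case. The sketch below assumes that g(x) is a step function with finitely many positive jumps, so that integral (131) reduces to a finite sum; the jump data and the names phi, hermitian_form and integral_form are illustrative assumptions, not part of the text.

```python
import cmath

# Hypothetical step function g: abscissa of jump -> size of jump (all > 0).
JUMPS = {-1.0: 0.5, 0.3: 1.25, 2.0: 0.25}

def phi(t):
    """Integral (131) for the step function g, i.e. a finite sum."""
    return sum(a * cmath.exp(1j * x * t) for x, a in JUMPS.items())

def hermitian_form(ts, xis):
    """The form (134) for points ts and complex variables xis."""
    return sum(phi(tp - tq) * xp * xq.conjugate()
               for tp, xp in zip(ts, xis)
               for tq, xq in zip(ts, xis))

def integral_form(ts, xis):
    """The same quantity written as the integral of a squared modulus."""
    return sum(a * abs(sum(cmath.exp(1j * tp * x) * xp
                           for tp, xp in zip(ts, xis))) ** 2
               for x, a in JUMPS.items())
```

For any choice of points and complex coefficients the two functions agree up to rounding, and the common value is real and non-negative.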

Let us introduce a new concept. We shall call $\varphi(t)$ positive definite
if it is continuous and bounded in $(-\infty, +\infty)$, satisfies identity
(132), and if, in addition, the Hermitian form (134) is non-negative for
any m and any choice of points $t_p$. It follows from the above arguments
that, if the function $\varphi(t)$ is expressible by a Fourier-Stieltjes
integral (131) with a function g(x) of the indicated type, it is positive
definite. It can be shown that, conversely, every positive definite
function $\varphi(t)$ is expressible by integral (131) with a function g(x)
of the type indicated (Bochner, Vorlesungen über Fouriersche Integrale,
p. 74, or Math. Annal. Bd. 108). Let us return to integral (131) and write
g(x) as the sum: $g(x) = g_1(x) + g_2(x)$, where $g_1(x)$ is the jump
function and $g_2(x)$ is the continuous part of g(x). The function
$\varphi(t)$ can now be written as the sum
$\varphi(t) = \varphi_1(t) + \varphi_2(t)$, where

$$\varphi_1(t) = \int_{-\infty}^{+\infty} e^{itx}\,dg_1(x); \quad \varphi_2(t) = \int_{-\infty}^{+\infty} e^{itx}\,dg_2(x). \tag{136}$$

Let $x_k$ (k = 1, 2, ...) be the abscissae of the points of discontinuity
of g(x) and $a_k = g(x_k + 0) - g(x_k - 0)$, where obviously $a_k > 0$, and
the series consisting of the $a_k$ is convergent.
We have an expansion of $\varphi_1(t)$ as a uniformly convergent series:

$$\varphi_1(t) = \sum_{k} a_k e^{ix_k t}. \tag{137}$$


If the number of points $x_k$ is finite, the sum written will be finite.
We bring in the so-called mean value of any continuous function F(t),
defined in the interval $(-\infty, +\infty)$,

$$M\{F(t)\} = \lim_{\omega \to \infty} \frac{1}{2\omega} \int_{-\omega}^{+\omega} F(t)\,dt, \tag{138}$$

if the limit on the right exists. It may easily be shown that, for the
$\varphi_2(t)$ defined by (136) with the continuous function $g_2(x)$:

$$M\{\varphi_2(t)\} = 0. \tag{139}$$

For, on performing the integration under the integral sign, we get

$$\frac{1}{2\omega}\int_{-\omega}^{+\omega} \varphi_2(t)\,dt = \int_{-\infty}^{+\infty} \frac{\sin \omega x}{\omega x}\,dg_2(x).$$

We split the interval of integration $[-\infty, +\infty]$ into three parts:
$[-\infty, -a]$, $[-a, +a]$, $[a, +\infty]$. Using elementary inequalities
for the integrals, we have

$$\left|\frac{1}{2\omega}\int_{-\omega}^{+\omega} \varphi_2(t)\,dt\right| \le \frac{1}{a\omega}[g_2(-a) - g_2(-\infty)] + \frac{1}{a\omega}[g_2(+\infty) - g_2(a)] + g_2(a) - g_2(-a).$$

Let $\varepsilon$ be a given positive number. In view of the continuity of
$g_2(x)$ at x = 0, we can choose an a so small that
$g_2(a) - g_2(-a) < \varepsilon/2$. With a fixed, the first two terms on
the right-hand side tend to zero as $\omega \to \infty$, so that, for all
sufficiently large $\omega$:

$$\left|\frac{1}{2\omega}\int_{-\omega}^{+\omega} \varphi_2(t)\,dt\right| < \varepsilon,$$



whence (139) follows, since $\varepsilon$ is arbitrary. The possibility of
integrating with respect to t under the Stieltjes integral sign is easily
justified. We now consider the function $\varphi_2(t)e^{-i\lambda t}$,
which we can write in the form

$$\varphi_2(t)\,e^{-i\lambda t} = \int_{-\infty}^{+\infty} e^{i(x-\lambda)t}\,dg_2(x),$$

or, on changing the variable of integration,

$$\varphi_2(t)\,e^{-i\lambda t} = \int_{-\infty}^{+\infty} e^{ixt}\,dg_2(x + \lambda),$$

and, since $g_2(x)$ is continuous, we have for any real $\lambda$:

$$M\{\varphi_2(t)\,e^{-i\lambda t}\} = 0.$$

Further, if we make use of the uniform convergence of series (137) in the
interval $(-\infty, +\infty)$, as also the fact that

$$M\{e^{i\lambda t}\} = 0 \quad\text{for}\quad \lambda \ne 0,$$

we obtain

$$M\{\varphi_1(t)\,e^{-i\lambda t}\} = a_k \quad (\lambda = x_k),$$

and

$$M\{\varphi_1(t)\,e^{-i\lambda t}\} = 0,$$

if $\lambda$ does not coincide with one of the $x_k$. The above arguments
lead us to the following general result: if $\varphi(t)$ is expressible by
integral (131) with a non-decreasing bounded function g(x), then
$M\{\varphi(t)\,e^{-i\lambda t}\} = g(\lambda + 0) - g(\lambda - 0)$,
and the right-hand side is zero if $\lambda$ is not a point of
discontinuity of g(x). The mean value of the product
$\varphi(t)\,e^{-i\lambda t}$ represents a generalization of the Fourier
coefficient for a periodic function. We shall return later to generalized
Fourier coefficients in connection with a generalization of the closure
formula [29]. The next section is concerned with an inversion formula for
integral (131), i.e. a formula expressing g(x) in terms of $\varphi(t)$.
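
The way the mean value picks out the jumps of g(x) can be seen in a short numerical sketch. It assumes a hypothetical step function with three jumps, for which the average over [-omega, +omega] can be written in closed form, since the average of exp(iut) over that interval equals sin(u omega)/(u omega). The names JUMPS, sinc and mean_coefficient are illustrative only.

```python
import math

# Hypothetical step function g: abscissa of jump -> size of jump.
JUMPS = {-1.0: 0.5, 0.3: 1.25, 2.0: 0.25}

def sinc(u):
    """sin(u)/u, extended by continuity to u = 0."""
    return 1.0 if u == 0.0 else math.sin(u) / u

def mean_coefficient(lam, omega):
    """Average of phi_1(t) exp(-i lam t) over [-omega, omega], in closed form.

    Each term a_k exp(i (x_k - lam) t) averages to a_k sinc((x_k - lam) omega).
    """
    return sum(a * sinc((x - lam) * omega) for x, a in JUMPS.items())
```

For large omega the value at lam = 0.3 is close to the jump 1.25, while at a point of continuity such as lam = 1.0 it is close to zero.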


27. Inversion formula. In Volume II, we proved the second mean value
theorem, the possibility of expanding a function in a Fourier series, and the
Fourier integral formula, on the assumption that the function in question
satisfies the Dirichlet conditions. On glancing back over the proofs, it may
easily be seen that they retain their force if the Dirichlet conditions are
replaced by the requirement that the function be of bounded variation. We
have the following result for the Fourier integral: if f(x) is a function of
bounded variation in any finite interval and is absolutely integrable in the
Riemann sense over an infinite interval, the formula

$$\varphi(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} f(x)\,e^{-itx}\,dx \tag{140}$$

implies the following inversion formula:

$$f(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} \varphi(t)\,e^{itx}\,dt, \tag{141}$$

where the last integral has to be understood in the sense of the principal
value and f(x) on the left-hand side must be replaced at a point of
discontinuity by $\tfrac{1}{2}[f(x - 0) + f(x + 0)]$. This result is
sometimes written in a rather different form, viz. the formula

$$\varphi(t) = \int_{-\infty}^{+\infty} f(x)\,e^{itx}\,dx \tag{142}$$

implies that [III$_2$; 130]:

$$f(x) = \frac{1}{2\pi}\,\mathrm{V.P.}\!\int_{-\infty}^{+\infty} \varphi(t)\,e^{-itx}\,dt = \frac{1}{2\pi} \lim_{N \to \infty} \int_{-N}^{+N} \varphi(t)\,e^{-itx}\,dt. \tag{143}$$

Our problem is to construct an inversion formula for integral (131). The
form that this formula must have may easily be guessed. We use the following
heuristic method, which naturally does not have the force of a proof. We
replace dg(x) in integral (131) by g'(x) dx, on the assumption that g'(x) is
the derivative of g(x). We thus get

$$\varphi(t) = \int_{-\infty}^{+\infty} g'(x)\,e^{itx}\,dx,$$

and the inversion formula for the ordinary Fourier integral gives us

$$g'(x) = \frac{1}{2\pi} \lim_{N \to \infty} \int_{-N}^{+N} \varphi(t)\,e^{-itx}\,dt.$$

On integrating both sides of the last formula with respect to x, say from
x = 0 to some value x, and carrying out the integration on the right-hand
side under the integral sign, we finally obtain the inversion formula:

$$g(x) - g(0) = \frac{1}{2\pi} \lim_{N \to \infty} \int_{-N}^{+N} \varphi(t)\,\frac{1 - e^{-itx}}{it}\,dt. \tag{144}$$

The value of g(x) at a fixed point x = 0 has appeared on the left-hand side,
since g(x) is evidently only defined up to a constant term. We must now
provide a strict proof of (144).

Having fixed the value of x in some way, we consider the following function
of the variable y:

$$h(y) = g(y + x) - g(y). \tag{145}$$

Since it is the difference between two increasing functions, it will be of
bounded variation in any finite interval. Let us show further that it is
absolutely integrable over an infinite interval. We shall assume for
definiteness that x > 0. The proof is essentially the same if x < 0. The
function g(y) is increasing, so that we can write

$$\int_p^q |h(y)|\,dy = \int_p^q [g(y + x) - g(y)]\,dy = \int_p^q g(y + x)\,dy - \int_p^q g(y)\,dy \quad (q > p), \tag{146}$$

or, on carrying out the change of variable y + x = z in the first integral:

$$\int_p^q |h(y)|\,dy = \int_{p+x}^{q+x} g(z)\,dz - \int_p^q g(z)\,dz = \int_q^{q+x} g(z)\,dz - \int_p^{p+x} g(z)\,dz.$$

On taking into account the fact that $g(p) \le g(z)$ in the interval
[p, p + x] whilst $g(q + x) \ge g(z)$ in the interval [q, q + x], we get

$$\int_p^q |h(y)|\,dy \le [g(q + x) - g(p)]\,x.$$

It is clear from this that integral (146) is as small as desired for
sufficiently large p and any q > p, which in fact proves that h(y) is
absolutely integrable over an infinite interval. Fourier's formula is
therefore applicable to function (145):

$$g(y + x) - g(y) = \frac{1}{2\pi} \lim_{N \to \infty} \int_{-N}^{+N} e^{-ity} \left[\int_{-\infty}^{+\infty} [g(z + x) - g(z)]\,e^{itz}\,dz\right] dt$$

or, with y = 0:

$$g(x) - g(0) = \frac{1}{2\pi} \lim_{N \to \infty} \int_{-N}^{+N} \left[\int_{-\infty}^{+\infty} [g(z + x) - g(z)]\,e^{itz}\,dz\right] dt. \tag{147}$$

Let us consider the inner integral I and apply to it the formula for
integration by parts:

$$I = \frac{1}{it} \int_{-\infty}^{+\infty} [g(z + x) - g(z)]\,d e^{itz} = -\frac{1}{it} \int_{-\infty}^{+\infty} e^{itz}\,d[g(z + x) - g(z)],$$

whence, on using (131), we obtain

$$I = \frac{1}{it}\,\varphi(t) - \frac{1}{it} \int_{-\infty}^{+\infty} e^{itz}\,dg(z + x).$$

On carrying out the change of variables of integration z + x = u in the
last integral, we finally arrive at the following equation:

$$\int_{-\infty}^{+\infty} [g(z + x) - g(z)]\,e^{itz}\,dz = \frac{1 - e^{-itx}}{it}\,\varphi(t). \tag{148}$$

On substituting this in (147), we obtain the inversion formula (144).
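
Formula (144) can be tested numerically on a case where both sides are known. The sketch below assumes, purely for illustration, that g(x) is the standard normal distribution function, whose Fourier-Stieltjes transform is exp(-t^2/2); the truncation N, the number of steps and the function names are choices made here, not taken from the text.

```python
import math

def phi(t):
    """Transform (131) of the standard normal distribution function."""
    return math.exp(-0.5 * t * t)

def inversion(x, N=12.0, steps=4000):
    """Midpoint-rule approximation of the right-hand side of (144).

    phi is real and even, so the imaginary part of the integrand is odd
    and cancels; only phi(t) * sin(t x) / t has to be integrated.
    """
    h = 2.0 * N / steps
    total = 0.0
    for j in range(steps):
        t = -N + (j + 0.5) * h   # midpoints avoid t = 0 when steps is even
        total += phi(t) * math.sin(t * x) / t
    return total * h / (2.0 * math.pi)

def g(x):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
```

For this smooth example, inversion(x) and g(x) - g(0) agree to many decimal places. For a g(x) with jumps the integrand decays slowly, and a plain quadrature of this kind would need far more care.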

We recall that the left-hand side f(x) in (143), giving the inversion of the
ordinary Fourier transformation, has to be reckoned equal to the arithmetic
mean of its limits from the left and right at points of discontinuity; we
know this from [II; 143]. Consequently, the same remark holds for the
left-hand side of (147) and the left-hand side of inversion formula (144).
Inversion formula (144) for integral (131) also holds when g(x) is a
function of bounded variation. To see this, we only need to write g(x) as
the difference between two increasing functions, split integral (131) into
two integrals and apply to each the inversion formula proved above.


28. Convolution theorem. As we know, the inversion formula for the ordinary
Fourier integral is directly connected with the inversion formula for the
Laplace integral [IV; 44, 45], and we had a convolution theorem for the
latter integral. A similar theorem holds for the Fourier-Stieltjes integral.
Let $g_1(x)$ and $g_2(x)$ be two functions with the properties indicated
above for g(x). The following general Stieltjes integral will exist:

$$g_3(z) = \int_{-\infty}^{+\infty} g_2(z - x)\,dg_1(x). \tag{149}$$


It is easily shown that $g_3(x)$ has the above-mentioned properties and
that, if $g_1(x)$ and $g_2(x)$ are continuous, $g_3(x)$ is also continuous.
We form the Fourier-Stieltjes transformation for functions $g_j(x)$:

$$\varphi_j(t) = \int_{-\infty}^{+\infty} e^{itx}\,dg_j(x) \quad (j = 1, 2, 3). \tag{150}$$

The convolution theorem amounts to the assertion that these transformed
functions satisfy the following simple equation:

$$\varphi_3(t) = \varphi_1(t)\,\varphi_2(t). \tag{151}$$

We apply (148) to function $\varphi_3(t)$:

$$\frac{1 - e^{-itu}}{it}\,\varphi_3(t) = \int_{-\infty}^{+\infty} [g_3(z + u) - g_3(z)]\,e^{itz}\,dz.$$

On replacing $g_3$ by its expression (149), we get

$$\frac{1 - e^{-itu}}{it}\,\varphi_3(t) = \int_{-\infty}^{+\infty} \left\{\int_{-\infty}^{+\infty} [g_2(z + u - x) - g_2(z - x)]\,dg_1(x)\right\} e^{itz}\,dz.$$

We change the order of integration. We shall not dwell on the proof that
this is possible, since it will follow from a general theorem on changing
the order of integration which will be proved later. We get

$$\frac{1 - e^{-itu}}{it}\,\varphi_3(t) = \int_{-\infty}^{+\infty} \left\{\int_{-\infty}^{+\infty} [g_2(z + u - x) - g_2(z - x)]\,e^{itz}\,dz\right\} dg_1(x).$$

We replace the z in the inner integral by the new variable of integration y
given by z - x = y. The last equation can now be written as

$$\frac{1 - e^{-itu}}{it}\,\varphi_3(t) = \int_{-\infty}^{+\infty} \left\{\int_{-\infty}^{+\infty} [g_2(y + u) - g_2(y)]\,e^{ity}\,dy\right\} e^{itx}\,dg_1(x).$$

The inner integral can be expressed in terms of $\varphi_2(t)$ in accordance
with (148), and we arrive at the formula

$$\frac{1 - e^{-itu}}{it}\,\varphi_3(t) = \frac{1 - e^{-itu}}{it}\,\varphi_2(t) \int_{-\infty}^{+\infty} e^{itx}\,dg_1(x),$$

which, by (150), is the same as (151).
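
Equation (151) is easy to check when both functions are step functions. In that case, a short calculation with (149) shows that the resulting function is again a step function, with a jump equal to the product ab at the point x + y for every pair of jumps a at x and b at y. The sketch below relies on this observation; the data and the names convolve and transform are hypothetical.

```python
import cmath

# Hypothetical step functions: abscissa of jump -> size of jump.
G1_JUMPS = {0.0: 0.5, 1.0: 0.5}
G2_JUMPS = {-0.5: 0.25, 2.0: 0.75}

def convolve(j1, j2):
    """Jumps of the function (149) built from two step functions."""
    out = {}
    for x, a in j1.items():
        for y, b in j2.items():
            out[x + y] = out.get(x + y, 0.0) + a * b
    return out

def transform(jumps, t):
    """Fourier-Stieltjes transformation (150) of a step function."""
    return sum(a * cmath.exp(1j * x * t) for x, a in jumps.items())

G3_JUMPS = convolve(G1_JUMPS, G2_JUMPS)
```

The transform of G3_JUMPS then coincides, up to rounding, with the product of the transforms of G1_JUMPS and G2_JUMPS at every t.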

It follows at once from the convolution theorem that, if $\varphi_1(t)$ and
$\varphi_2(t)$ are expressible by an integral of type (131), i.e. belong to
the class P of positive definite functions, the same can be said of their
product. It is immediately obvious that the linear combination
$c_1\varphi_1(t) + c_2\varphi_2(t)$ with positive coefficients also belongs
to class P. Further, if $\varphi(t)$ belongs to P, $\overline{\varphi(t)}$
belongs to P, viz. if $g_1(x)$ defines $\varphi(t)$, we have for
$\overline{\varphi(t)}$:

$$\overline{\varphi(t)} = \int_{-\infty}^{+\infty} e^{-itx}\,dg_1(x) = \int_{-\infty}^{+\infty} e^{itx}\,dg_2(x),$$

where $g_2(x) = -g_1(-x)$. Thus $|\varphi(t)|^2$ will belong to P and will
be defined by the function

$$h(x) = \int_{-\infty}^{+\infty} g_2(x - y)\,dg_1(y).$$

If $g_1(x)$ is continuous, h(x) will be continuous, and we have, by what was
proved in [26]:

$$M\{|\varphi(t)|^2\} = 0.$$


Let us return to the function $\varphi(t)$ defined by (131), and, as above,
let $a_k$ be generalized Fourier coefficients corresponding to
$\lambda = x_k$. On using (137) and the fact that the positive $a_k$ form a
convergent series, we get

$$\sum_k a_k^2 = M\{|\varphi_1(t)|^2\}. \tag{152}$$

On the other hand, by what has been proved,

$$M\{|\varphi_2(t)|^2\} = 0. \tag{153}$$

We have further:

$$|\varphi(t)|^2 = |\varphi_1(t) + \varphi_2(t)|^2 = |\varphi_1(t)|^2 + |\varphi_2(t)|^2 + \varphi_1(t)\overline{\varphi_2(t)} + \overline{\varphi_1(t)}\varphi_2(t).$$

In view of the Buniakowski-Schwarz inequality

$$\left|\frac{1}{2\omega}\int_{-\omega}^{+\omega} \varphi_1(t)\overline{\varphi_2(t)}\,dt\right|^2 \le \frac{1}{2\omega}\int_{-\omega}^{+\omega} |\varphi_1(t)|^2\,dt \cdot \frac{1}{2\omega}\int_{-\omega}^{+\omega} |\varphi_2(t)|^2\,dt,$$

together with (152) and (153), we can say that the right-hand side tends to
zero on indefinite increase of $\omega$. Hence,
$M\{\varphi_1(t)\overline{\varphi_2(t)}\} = 0$, and similarly,
$M\{\overline{\varphi_1(t)}\varphi_2(t)\} = 0$. Thus the following closure
theorem follows from (152) and (153) for any function of class P:

$$\sum_k a_k^2 = M\{|\varphi(t)|^2\}. \tag{154}$$
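
For a step function with finitely many jumps, the closure formula can be observed numerically. The sketch assumes hypothetical jump data and uses the closed form of the average of exp(iut) over [-omega, +omega], namely sin(u omega)/(u omega); none of the names below come from the text.

```python
import math

# Hypothetical step function g: abscissa of jump -> size of jump.
JUMPS = {-1.0: 0.5, 0.3: 1.25, 2.0: 0.25}

def sinc(u):
    """sin(u)/u, extended by continuity to u = 0."""
    return 1.0 if u == 0.0 else math.sin(u) / u

def mean_square(omega):
    """Average of |phi(t)|^2 over [-omega, omega] for the step function.

    |phi|^2 is the double sum of a_j a_k exp(i (x_j - x_k) t), and each
    term averages to a_j a_k sinc((x_j - x_k) omega).
    """
    return sum(a * b * sinc((x - y) * omega)
               for x, a in JUMPS.items() for y, b in JUMPS.items())

SUM_OF_SQUARES = sum(a * a for a in JUMPS.values())
```

As omega grows, the cross terms die out and mean_square(omega) approaches the sum of the squared jumps.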


29. The Cauchy-Stieltjes integral. We take the Cauchy integral over
the entire real axis:

$$\omega(z) = \int_{-\infty}^{+\infty} \frac{\psi(x)}{x - z}\,dx. \tag{155}$$

Given certain assumptions regarding $\psi(x)$, this integral exists, and
the function $\omega(z)$ of the complex variable z is regular both in the
upper and lower half-plane. These regular functions are different
analytic functions in the half-planes, and we know that $\psi(x)$ can be
expressed in terms of the jumps of $\omega(z)$ on the real axis, i.e. the
formula holds [IV; 85]:

$$\psi(x) = \lim_{\tau \to +0} \frac{1}{2\pi i}\,[\omega(x + \tau i) - \omega(x - \tau i)].$$


Let g(x) be a function of bounded variation in the infinite interval
$[-\infty, +\infty]$, and let us form the Cauchy-Stieltjes integral for a
complex value of z:

$$\omega(z) = \int_{-\infty}^{+\infty} \frac{1}{x - z}\,dg(x). \tag{156}$$

The integrated function 1/(x - z) is continuous throughout the real
axis and tends to zero as $x \to \pm\infty$. Thus integral (156) can be
understood as a Stieltjes integral [4]. We shall prove the following
inversion formula for integral (156):

$$g(x) - g(0) = \lim_{\tau \to +0} \frac{1}{2\pi i} \int_0^x [\omega(\sigma + \tau i) - \omega(\sigma - \tau i)]\,d\sigma, \tag{157}$$

where half the sum of limiting values from the left and right has to be
taken for g(x) at its points of discontinuity. We could have made a
guess at this inversion formula, just as we did above for inversion of
Fourier-Stieltjes integrals.

Before proving (157), we consider the Poisson integral for the case
of a half-plane. We put $z = \sigma + \tau i$ and separate the real and
imaginary parts in the Cauchy kernel:

$$\frac{1}{x - z} = \frac{x - \sigma}{(x - \sigma)^2 + \tau^2} + i\,\frac{\tau}{(x - \sigma)^2 + \tau^2}.$$

On separating the imaginary part in integral (155) and adding the
factor $1/\pi$, we in fact arrive at the Poisson integral for the half-plane:

$$F(\sigma, \tau) = \frac{1}{\pi} \int_{-\infty}^{+\infty} \frac{\tau}{(x - \sigma)^2 + \tau^2}\,\psi(x)\,dx. \tag{158}$$

It is evidently a harmonic function both in the upper and lower
half-planes. We shall prove the following as regards this integral:
if $\psi(x)$ is a bounded function, Riemann integrable in any finite
interval (say a function of bounded variation g(x)), integral (158) (which
evidently exists) tends to $\psi(\sigma)$ at points where $\psi(x)$ is
continuous and to half the sum of the limiting values from the left and
right at points of discontinuity of the first kind when $\tau$ tends to +0.
This convergence is uniform with respect to $\sigma$ in any closed interval
of variation of $\sigma$ lying inside the interval where $\psi(x)$ is
continuous. To prove this, we first notice the obvious equation:

$$\frac{2}{\pi} \int_0^{\infty} \frac{\tau}{x^2 + \tau^2}\,dx = 1. \tag{159}$$



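We replace x in (158) by the new variable of integration $y = x - \sigma$:

$$F(\sigma, \tau) = \frac{1}{\pi} \int_{-\infty}^{+\infty} \frac{\tau}{y^2 + \tau^2}\,\psi(y + \sigma)\,dy.$$

We split the interval of integration into two: $[-\infty, 0]$,
$[0, +\infty]$, and introduce the new variable of integration $y_1 = -y$
into the first of the integrals obtained. We thus arrive at the formula:

$$F(\sigma, \tau) = \frac{2}{\pi} \int_0^{\infty} \frac{\tau}{x^2 + \tau^2} \cdot \frac{\psi(\sigma + x) + \psi(\sigma - x)}{2}\,dx. \tag{160}$$

Suppose that $\sigma$ is a point where $\psi(x)$ is continuous, or a point
of discontinuity of the first kind. We multiply both sides of (159) by
$[\psi(\sigma + 0) + \psi(\sigma - 0)]/2$, and subtract term by term from
(160). We thus obtain the equation

$$F(\sigma, \tau) - \frac{\psi(\sigma + 0) + \psi(\sigma - 0)}{2} = \frac{2}{\pi} \int_0^{\infty} \frac{\tau}{x^2 + \tau^2}\,w(x)\,dx, \tag{161}$$

where

$$w(x) = \frac{\psi(\sigma + x) + \psi(\sigma - x)}{2} - \frac{\psi(\sigma + 0) + \psi(\sigma - 0)}{2}. \tag{162}$$

Let $\varepsilon$ be a given positive number. There exists a positive
$\eta$ such that $|w(x)| < \varepsilon$ for $0 < x \le \eta$. If $\sigma$
lies in a closed interval contained in an interval where $\psi(x)$ is
continuous, in view of the uniform continuity of $\psi(x)$, the number
$\eta$ is determined only by $\varepsilon$ and is independent of $\sigma$.
We split the interval of integration in (161) into $[0, \eta]$ and
$[\eta, \infty]$. We have for the first interval of integration:

$$\left|\frac{2}{\pi} \int_0^{\eta} \frac{\tau}{x^2 + \tau^2}\,w(x)\,dx\right| \le \varepsilon\,\frac{2}{\pi} \int_0^{\eta} \frac{\tau}{x^2 + \tau^2}\,dx \le \varepsilon\,\frac{2}{\pi} \int_0^{\infty} \frac{\tau}{x^2 + \tau^2}\,dx = \varepsilon.$$

As regards an inequality for the integral over the second interval,
we observe that w(x) is bounded, as follows from the fact that $\psi(x)$ is
bounded, i.e. $|w(x)| \le L$, where L is a positive number. We thus have

$$\left|\frac{2}{\pi} \int_{\eta}^{\infty} \frac{\tau}{x^2 + \tau^2}\,w(x)\,dx\right| \le L\,\frac{2}{\pi} \int_{\eta}^{\infty} \frac{\tau}{x^2 + \tau^2}\,dx = \frac{2L}{\pi} \left(\frac{\pi}{2} - \arctan\frac{\eta}{\tau}\right),$$

and we get the following inequality for the difference on the left-hand
side of (161):

$$\left|F(\sigma, \tau) - \frac{\psi(\sigma + 0) + \psi(\sigma - 0)}{2}\right| \le \varepsilon + \frac{2L}{\pi} \left(\frac{\pi}{2} - \arctan\frac{\eta}{\tau}\right).$$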





The difference contained in the second term obviously tends to zero
when the positive number $\tau$ tends to zero, and this second term will be
less than $\varepsilon$ for all $\tau$ sufficiently close to zero. We thus
have, for all $\tau$ sufficiently close to zero:

$$\left|F(\sigma, \tau) - \frac{\psi(\sigma + 0) + \psi(\sigma - 0)}{2}\right| < 2\varepsilon,$$

from which our above assertion follows, since $\varepsilon$ is arbitrary.
The closeness of $\tau$ to zero is guaranteed by the value of $\eta$, which
does not depend on $\sigma$ for the intervals in which $\psi(x)$ is
continuous. Hence follows the uniformity of the convergence in these
intervals of continuity.
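
The behaviour of the Poisson integral (158) near the real axis can be illustrated on a step function. The sketch below uses the substitution x = sigma + tau tan(theta), which turns the kernel times dx into d(theta), and then a midpoint rule. The substitution, the number of steps and the names poisson and step are choices made for this illustration only.

```python
import math

def poisson(psi, sigma, tau, steps=20000):
    """Approximate (158) for tau > 0 via x = sigma + tau * tan(theta).

    Under this substitution the kernel tau / ((x - sigma)^2 + tau^2) dx
    becomes d(theta), with theta running over (-pi/2, pi/2).
    """
    h = math.pi / steps
    total = 0.0
    for j in range(steps):
        theta = -0.5 * math.pi + (j + 0.5) * h
        total += psi(sigma + tau * math.tan(theta))
    return total * h / math.pi

def step(x):
    """A bounded function with a discontinuity of the first kind at x = 0."""
    return 1.0 if x > 0 else 0.0
```

At the point of discontinuity sigma = 0 the value is one half for every tau > 0, i.e. half the sum of the one-sided limits. At sigma = 1 the value approaches 1 as tau decreases, in agreement with the closed form 1/2 + arctan(sigma/tau)/pi for this particular function.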
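We now turn to the proof of the inversion formula (157). We form the
function

$$F_1(\sigma, \tau) = \frac{1}{2\pi i}\,[\omega(\sigma + \tau i) - \omega(\sigma - \tau i)] = \frac{1}{\pi} \int_{-\infty}^{+\infty} \frac{\tau}{(x - \sigma)^2 + \tau^2}\,dg(x).$$

On integrating by parts, we can write

$$F_1(\sigma, \tau) = -\frac{1}{\pi} \int_{-\infty}^{+\infty} g(x)\,\frac{\partial}{\partial x}\!\left[\frac{\tau}{(x - \sigma)^2 + \tau^2}\right] dx = \frac{1}{\pi} \int_{-\infty}^{+\infty} g(x)\,\frac{\partial}{\partial \sigma}\!\left[\frac{\tau}{(x - \sigma)^2 + \tau^2}\right] dx.$$

Since g(x) is bounded, the improper integral written is uniformly
convergent with respect to $\sigma$ belonging to any finite interval; by
integrating both sides of the last formula with respect to $\sigma$ over
the interval $[0, x_0]$, where the integration on the right-hand side is
carried out under the integral sign, we get

$$\frac{1}{2\pi i} \int_0^{x_0} [\omega(\sigma + \tau i) - \omega(\sigma - \tau i)]\,d\sigma = \frac{1}{\pi} \int_{-\infty}^{+\infty} \frac{\tau}{(x - x_0)^2 + \tau^2}\,g(x)\,dx - \frac{1}{\pi} \int_{-\infty}^{+\infty} \frac{\tau}{x^2 + \tau^2}\,g(x)\,dx.$$

The integrals on the right-hand side are Poisson integrals, and we can
apply the theorem proved above to arrive at inversion formula (157)
(with $x_0$ in place of x). This formula was first given by Stieltjes and
is usually known as the Stieltjes formula. It must be remarked that the
values of the function g(x) at the ends of the interval
$[-\infty, +\infty]$ are not important for integral (156), since the
integrated function tends to zero as x tends to $\pm\infty$.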


CHAPTER II 


SET FUNCTIONS AND 
THE LEBESGUE INTEGRAL 


§ 1. Set functions and the theory of measure 


30. Operations on sets. We shall construct a more general type of 
integral by dividing the basic interval of integration into point sets 
of a general kind, instead of into the usual intervals. Moreover, a point 
set of a more general type will often be used instead of an ordinary 
interval as the basic domain of integration. The first section of the 
present chapter is devoted to a study of such general sets and to 
functions defined on such sets. We start with a study of the funda- 
mental concepts and facts in regard to sets consisting of any elements 
(i.e. not necessarily point sets). Some fundamental concepts and nota- 
tion, of which wide use will be made later, must first be introduced 
for general sets. We shall in fact primarily make use of point sets, i.e. 
sets whose elements are points of a straight line or a plane, or in gene- 
ral of a multi-dimensional space. 

If an element x belongs to a set A, this is written as $x \in A$. If x does
not belong to A, this is written as $x \notin A$. If all the elements
appearing in a set A appear also in a set B, we say that A is part of B, and
write $A \subset B$ or $B \supset A$. If sets A and B contain the same
elements, we write A = B. If all the elements appearing in A appear also in
B, but there are elements of B which do not belong to A, we sometimes
underline this last fact by saying that A is a regular part of B. If
$A \subset B$ and $B \subset D$, it follows that $A \subset D$. Let

$$A_1, A_2, A_3, \ldots \tag{1}$$

be sets, the number of which is finite or denumerable. The sum of sets
(1) is defined as the set $\mathcal{E}$, whose elements are elements
belonging to at least one of the sets $A_n$. We denote a sum of sets by
using the ordinary symbols

$$\mathcal{E} = A_1 + A_2 + \cdots \quad\text{or}\quad \mathcal{E} = \sum_n A_n.$$



The product of sets (1) is defined as the set $\mathcal{E}_1$, whose
elements are the elements appearing in all the sets $A_n$. A product of
sets is written in the usual way as

$$\mathcal{E}_1 = A_1 A_2 \cdots \quad\text{or}\quad \mathcal{E}_1 = \prod_n A_n.$$

A product of sets may contain no elements. The set that contains
no elements is called the empty set and is written as $\Lambda$. For
instance, if A and B are sets having no common elements†, we have
$AB = \Lambda$. Sums and products of sets are obviously subject to the
commutative and associative laws. We have, for instance:

$$A + B = B + A; \quad A + (B + D) = (A + B) + D; \quad AB = BA; \quad A \cdot (BD) = (AB) \cdot D. \tag{2}$$

The distributive law also holds, i.e.

$$B \sum_n A_n = \sum_n B A_n. \tag{3}$$

To prove this, we have to show that every element x appearing in a
set on the left-hand side also appears in the set on the right-hand side,
and vice versa. If x is an element of the set on the left-hand side of (3),
it belongs simultaneously to B and to the sum of sets $A_n$, i.e. to at
least one of sets $A_n$. Let $x \in A_p$. Hence $x \in B$ and $x \in A_p$,
i.e. $x \in BA_p$, so that x belongs to the set on the right-hand side of
(3). Conversely, if x belongs to this latter set, it belongs to at least
one of the products $BA_n$. Suppose that $x \in BA_p$, i.e. $x \in B$ and
$x \in A_p$. Hence it follows that $x \in B$ and that x belongs to the sum
of sets $A_n$, i.e. x belongs to the set on the left-hand side of (3), and
this formula is proved. If $B \subset A$, then obviously A + B = A. Hence
it follows at once from (2) and (3) that

$$(A + B)(A + D) = A + BD. \tag{4}$$

We now define a difference. We understand by the difference of
sets A - B the set whose elements are elements of A not appearing
in B. If $A \subset B$, then A - B is the empty set. It must be noticed that
we do not assume that $B \subset A$ when defining the difference A - B.
If $B \subset A$, we have the obvious formula

$$A = B + (A - B).$$

We have in the general case

$$A + B = B + (A - B). \tag{5}$$


† If the sets A and B have no common elements, they are said to be
disjoint [Transl.].


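We must mention some formulae which will be useful later. The proofs
present no difficulty. If $A - B = \mathcal{E}_1$ and
$B - A = \mathcal{E}_2$, then $A + \mathcal{E}_2 = B + \mathcal{E}_1$.
If $A_n \subset B_n$, then

$$\sum_n A_n \subset \sum_n B_n \quad\text{and}\quad \sum_n B_n - \sum_n A_n \subset \sum_n (B_n - A_n). \tag{6}$$

The following formulae are connected with the concept of difference:

$$A - (B - D) \subset (A - B) + D; \tag{7}$$

$$AB = A - (A - B); \tag{7$_1$}$$

$$(A_1 - A_2) - (B_1 - B_2) \subset (A_1 - B_1) + (B_2 - A_2); \tag{8}$$

$$A + B = (A - B) + (B - A) + AB. \tag{8$_1$}$$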
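
Formulae (4), (5), (7), (7$_1$), (8) and (8$_1$) can be confirmed mechanically on small finite sets. The following sketch uses Python's built-in sets, with union standing for the sum, intersection for the product and set difference for the difference. It checks the formulae exhaustively over all subsets of a three-element universe, which is an illustration rather than a proof; the helper names are hypothetical.

```python
from itertools import combinations, product

def subsets(universe):
    """All subsets of a finite universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def identities_hold(A, B, D):
    """Formulae (4), (5), (7), (7_1) and (8_1) for one triple of sets."""
    return ((A | B) & (A | D) == A | (B & D)             # (4)
            and A | B == B | (A - B)                     # (5)
            and A - (B - D) <= (A - B) | D               # (7)
            and A & B == A - (A - B)                     # (7_1)
            and A | B == (A - B) | (B - A) | (A & B))    # (8_1)

def formula_8_holds(A1, A2, B1, B2):
    """Formula (8) for one quadruple of sets."""
    return (A1 - A2) - (B1 - B2) <= (A1 - B1) | (B2 - A2)

ALL = subsets({0, 1, 2})
OK_THREE = all(identities_hold(A, B, D) for A, B, D in product(ALL, repeat=3))
OK_FOUR = all(formula_8_holds(*q) for q in product(ALL, repeat=4))
```

Both flags come out true, i.e. no counterexample exists among the 512 triples and 4096 quadruples examined.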

We shall next establish the concept of a monotonic sequence of sets, 

and the concept of limit. If Z1, Za» ... is a given infinite sequence of 
sets, and 

ECE CÉG... (9) 


we shall describe the sequence as increasing. A decreasing sequence of 
sets is defined by the condition that 


6,655 6D... (10) 


In case (9), the limit of sets 7, is defined as the set whose elements 
are elements belonging to at least one of the n. We write this as 
g = lim @,,. Notice that, in case (9), if an element x belongs to the 


Nooo 
set Zp, it belongs to all the sets %, for n > k. We have the obvious 
formula in case (9): 


Z= lim Z, = YF, (11) 


n= n=1 


which can also be written as 


& = lim F,=F + > (Frai— F)- (12) 
ioe k=1 

Pairs of the sets appearing in the sum on the right-hand side of (12) 
have no common points [i.e. these sets are pairwise disjoint]. In case 
(10) we say that @ is the limit of sets %, if its elements are elements 

belonging to all the n. We have in this case: 
g = lim 8, = [J Zn (13) 

n=1 


Ti->0o 


84 SET FUNCTIONS AND THE LEBESGUE INTEGRAL [3] 


and furthermore, we can write in case (10):

$$\mathcal{E}_1 = \mathcal{E} + \sum_{k=1}^{\infty} (\mathcal{E}_k - \mathcal{E}_{k+1}). \tag{14}$$


The terms on the right-hand side of (14) are pairwise disjoint. 
We have defined the limit of a sequence of sets only in the case 
of a monotonic sequence. A general definition could have been given, 
but we shall not dwell on this, since it is of no use to us in what follows. 
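The decomposition (12) of the limit of an increasing sequence into pairwise disjoint pieces can be illustrated on a finite truncation. The sketch below is an assumption-laden toy example, not part of the original text: the sequence is taken to be constant after its last listed term, so that its limit is simply that last set, and the name `disjoint_pieces` is hypothetical.

```python
def disjoint_pieces(increasing):
    """Split an increasing list E_1, E_2, ..., E_N into E_1, E_2 - E_1, ..., as in (12)."""
    pieces = [set(increasing[0])]
    for previous, current in zip(increasing, increasing[1:]):
        pieces.append(set(current) - set(previous))
    return pieces

# Hypothetical example: E_n = {1, ..., n} for n = 1, ..., 5.
sequence = [set(range(1, n + 1)) for n in range(1, 6)]
pieces = disjoint_pieces(sequence)

# The sum of the pieces reproduces the limit (here the last set) ...
union_of_pieces = set().union(*pieces)
# ... and no two pieces have a common element.
pairwise_disjoint = all(pieces[i].isdisjoint(pieces[j])
                        for i in range(len(pieces))
                        for j in range(i + 1, len(pieces)))
```

The same bookkeeping, applied to the differences $\mathcal{E}_k - \mathcal{E}_{k+1}$ of a decreasing sequence, gives the pieces of (14).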

A further concept must be introduced specially for point sets. Let
$\mathcal{E}$ be a set of points on a plane. The set of all points of the plane not
belonging to $\mathcal{E}$ will be termed the complement of $\mathcal{E}$. This complementary
set is usually written as $C\mathcal{E}$. The following formulae for complementary
sets must be noted. If the concept of complement is applied
twice, we obtain the original set, i.e. $C(C\mathcal{E}) = \mathcal{E}$. If $\mathcal{E}_1 \subset \mathcal{E}_2$, then
$C\mathcal{E}_1 \supset C\mathcal{E}_2$. The further formulae:

$$\prod_n \mathcal{E}_n = C \sum_n C\mathcal{E}_n, \tag{15}$$

$$\sum_n \mathcal{E}_n = C \prod_n C\mathcal{E}_n, \tag{16}$$

$$C \prod_n \mathcal{E}_n = \sum_n C\mathcal{E}_n, \tag{17}$$

$$C \sum_n \mathcal{E}_n = \prod_n C\mathcal{E}_n, \tag{18}$$

$$C\mathcal{E}_1 - C\mathcal{E}_2 = \mathcal{E}_2 - \mathcal{E}_1, \tag{19}$$

$$\mathcal{E}_1 - \mathcal{E}_2 = \mathcal{E}_1 \cdot C\mathcal{E}_2 \tag{20}$$

are proved without difficulty. The concept of complement can
obviously be brought in for a straight line, when the initial set is
arranged on the line, or similarly, for any multi-dimensional space.
The concept of complement with respect to a set $A$ is sometimes
introduced. If every point of the set $\mathcal{E}$ belongs to $A$, the complement
of $\mathcal{E}$ with respect to $A$ is defined as the difference $A - \mathcal{E}$. We shall
only use the concept of complement with respect to the entire space,
i.e. with respect to a straight line, plane, etc.
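Formulae (15) to (20) can likewise be tried out on a finite model. In the sketch below, which is only an illustration under stated assumptions, a finite `universe` plays the part of the whole space (the plane in the text), and the names `complement` and `check_complement_formulae` are hypothetical.

```python
def complement(E, universe):
    """Complement CE of E with respect to the whole 'space' `universe`."""
    return universe - E

def check_complement_formulae(family, universe):
    """Check (15)-(20) for a finite family (at least two sets) of subsets of `universe`."""
    total = frozenset().union(*family)           # sum of the sets
    common = universe.intersection(*family)      # product of the sets
    comps = [complement(E, universe) for E in family]
    sum_of_comps = frozenset().union(*comps)
    product_of_comps = universe.intersection(*comps)
    assert common == complement(sum_of_comps, universe)        # (15)
    assert total == complement(product_of_comps, universe)     # (16)
    assert complement(common, universe) == sum_of_comps        # (17)
    assert complement(total, universe) == product_of_comps     # (18)
    E1, E2 = family[0], family[1]
    assert complement(E1, universe) - complement(E2, universe) == E2 - E1   # (19)
    assert E1 - E2 == E1 & complement(E2, universe)                          # (20)
    return True

universe = frozenset(range(8))
family = [frozenset({0, 1, 2, 3}), frozenset({2, 3, 4}), frozenset({3, 5, 7})]
print(check_complement_formulae(family, universe))
```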


31. Point sets. We now introduce some ideas and results that relate
specially to point sets. When discussing the theory of the multiple
Riemann integral in Volume II, some information was given in regard
to point sets on a plane or in any $n$-dimensional space. We shall repeat
what was said in Volume II with certain important additions. For the
sake of definiteness we shall talk about point sets on the $XY$ plane.
Everything that is said may easily be extended to the case of a straight
line or any $n$-dimensional space.

Let us consider point sets on a plane referred to Cartesian axes $XY$.
A set is said to be bounded if the distance of any point of it from the
origin is less than a definite positive number $N$, i.e. $x^2 + y^2 < N^2$ for




all points of the set. An $\varepsilon$-neighbourhood of the point $P(a, b)$ is a
closed circle with centre $P$ and radius $\varepsilon$, i.e. the set of points $(x, y)$
satisfying the condition $(x - a)^2 + (y - b)^2 \le \varepsilon^2$. The point $P$ is said
to be a limit point or point of accumulation of the set $\mathcal{E}$ if any
$\varepsilon$-neighbourhood of $P$ contains an infinite set of points of $\mathcal{E}$. The point $P$
itself may or may not belong to $\mathcal{E}$. If all the limit points belong to $\mathcal{E}$,
$\mathcal{E}$ is said to be closed. A point $P$ of $\mathcal{E}$ is described as an interior point
if every point of some $\varepsilon$-neighbourhood of $P$ belongs to $\mathcal{E}$. The set $\mathcal{E}$ is
said to be open if every point of it is an interior point. Closed sets are
usually denoted by the letter $F$ with various subscripts (the French
word fermé = closed), and open sets by $O$ (ouvert = open).

The empty set is the "set" that contains no points. Our future
theorems may embrace the empty set, in which case it has to be
regarded as both open and closed. The boundary of an open set $O$ is
the set $l$ of points $P$ with the following property: the point $P$ does not
itself belong to $O$, but any $\varepsilon$-neighbourhood of $P$ contains points of $O$.
Since $O$ consists of interior points, we can say that any $\varepsilon$-neighbourhood
of $P$ contains an infinite set of points of $O$, and the boundary $l$
of an open set $O$ can be defined as the set of limit points of $O$ not
belonging to $O$. It may easily be shown that the boundary of an open
set is a closed set [II; 89].

Let $\mathcal{E}$ be a set. We associate with it all its limit points and write $\bar{\mathcal{E}}$
for the set obtained. This operation is known as closure of the set $\mathcal{E}$. If
$\mathcal{E}$ is a closed set, then $\bar{\mathcal{E}} = \mathcal{E}$. Let us show that $\bar{\mathcal{E}}$ is a closed set. Let
$P$ be a limit point of $\bar{\mathcal{E}}$, i.e. there is an infinite sequence of different
points $P_n$ ($n = 1, 2, \ldots$) belonging to $\bar{\mathcal{E}}$, where $P_n \to P$. If there is an
infinite set of points of $\mathcal{E}$ among the $P_n$, $P$ is a limit point of $\mathcal{E}$, and
hence, by the closure process, appears in $\bar{\mathcal{E}}$. Now let all the $P_n$, as from
a certain $n$, not belong to $\mathcal{E}$. By hypothesis, they belong to $\bar{\mathcal{E}}$, so that
they are limit points of $\mathcal{E}$. In any $\varepsilon/2$-neighbourhood of the point $P$
there is an infinite set of points $P_n$, and in any $\varepsilon/2$-neighbourhood of
each point $P_n$ there is an infinite set of points of $\mathcal{E}$. It follows at once
from this that any $\varepsilon$-neighbourhood of the point $P$ contains an infinite
set of points of $\mathcal{E}$, i.e. $P$ is a limit point of $\mathcal{E}$, and must therefore belong
to $\bar{\mathcal{E}}$ by virtue of the closure process.

We have thus shown that the set $\bar{\mathcal{E}}$ obtained by closure of any given
set $\mathcal{E}$ is necessarily a closed set. It may be remarked that the whole of
the plane is simultaneously an open and a closed set. We do not
associate the points at infinity with the plane. Every finite set of
points is a closed set. It can never have limit points.




We now bring in the concept of the distance between two sets. The
distance between sets $\mathcal{E}_1$ and $\mathcal{E}_2$ is the strict lower bound of the
distances from all possible points of $\mathcal{E}_1$ to points of $\mathcal{E}_2$. If the sets have a
common point, the distance between them is zero. But the distance
between sets can also be zero when they have no common points.
Points of two sets that have no common points may in fact be
indefinitely close. This cannot be the case if the sets are bounded and
closed, and we proved the following theorem in Volume II: if $\mathcal{E}_1$ and $\mathcal{E}_2$
are bounded and closed sets with no common points, the distance $d$
between them is positive, and at least one pair of points, $P$ of $\mathcal{E}_1$ and
$Q$ of $\mathcal{E}_2$, can be found such that $PQ = d$. It follows at once from the
proof of this theorem that it also holds when only one of the given
closed sets is bounded. In particular, the distance of any given point
of an open set from the boundary of this set is positive.
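A concrete pair of closed sets with no common points and zero distance may help here. The following example is an added illustration, not taken from the original text; both sets are unbounded, which is exactly why the theorem quoted above does not apply to them.

```latex
\mathcal{E}_1 = \{(x, y) : x \ge 1,\ y = 1/x\}, \qquad
\mathcal{E}_2 = \{(x, 0) : x \ge 1\}, \qquad
\mathcal{E}_1 \mathcal{E}_2 = \varnothing,
\\[4pt]
\text{distance from } (x, 1/x) \in \mathcal{E}_1 \text{ to } (x, 0) \in \mathcal{E}_2
\;=\; \frac{1}{x} \;\longrightarrow\; 0 \quad (x \to \infty).
```

The strict lower bound of the distances is therefore zero, although it is not attained by any pair of points.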


32. Properties of closed and open sets. We shall now prove some
special properties of closed and open sets.

THEOREM 1. The sum of a finite or denumerable number of open sets
is an open set. The product of a finite number of open sets is an open set.

We take the sum of a finite or denumerable number of open sets:

$$\mathcal{E} = \sum_n O_n.$$

If $P \in \mathcal{E}$, then $P$ belongs to at least one of the $O_n$. Let $P \in O_k$.
Since $O_k$ is an open set, an $\varepsilon$-neighbourhood of $P$ also belongs to $O_k$.
This $\varepsilon$-neighbourhood of $P$ also belongs to the sum $\mathcal{E}$, whence it follows
that $\mathcal{E}$ is an open set. We now take the finite product

$$\mathcal{E} = \prod_{n=1}^{m} O_n,$$

and let $P$ belong to $\mathcal{E}$. We show, as above, that an $\varepsilon$-neighbourhood
of $P$ also belongs to $\mathcal{E}$. Since $P$ belongs to $\mathcal{E}$, $P$ belongs to all the $O_k$
($k = 1, 2, \ldots, m$). Since the $O_k$ are open sets, there exists for any $O_k$
an $\varepsilon_k$-neighbourhood of $P$ belonging to $O_k$. If the number $\varepsilon$ is taken
equal to the least of the $\varepsilon_k$ ($k = 1, 2, \ldots, m$), the number of which is
finite, the $\varepsilon$-neighbourhood of $P$ will belong to all the $O_k$, and
consequently to $\mathcal{E}$. Notice that it is not permissible to assert that the
product of a denumerable number of open sets is an open set.
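A standard example, added here for illustration and not taken from the original text, shows how the product of a denumerable number of open sets can fail to be open. The proof above breaks down because the least of infinitely many positive radii need not exist as a positive number.

```latex
O_n = \{(x, y) : x^2 + y^2 < 1/n^2\} \quad (n = 1, 2, \ldots), \qquad
\prod_{n=1}^{\infty} O_n = \{(0, 0)\}.
```

Each $O_n$ is open, but the product is a single point, which has no interior points and so is not an open set.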

THEOREM 2. The set $CF$ is open and the set $CO$ is closed.

Let us prove the first assertion. Let $P$ belong to $CF$. We have to
show that an $\varepsilon$-neighbourhood of $P$ belongs to $CF$. This follows from




the fact that, if there were points of $F$ in any $\varepsilon$-neighbourhood of the
point $P$, then $P$, which does not belong to $F$ by hypothesis, would be a
limit point of $F$ and, since $F$ is closed, must belong to $F$, which implies
a contradiction.
THEOREM 3. The product of a finite or denumerable number of closed
sets is a closed set. The sum of a finite number of closed sets is a closed set.

Let us show, for instance, that the set

$$\mathcal{E} = \prod_n F_n$$

is closed. On passing to the complementary sets, we can write [30]

$$C\mathcal{E} = \sum_n CF_n.$$

By Theorem 2, the $CF_n$ are open sets, and by Theorem 1, the set $C\mathcal{E}$
is also open, so that its complementary set $\mathcal{E}$ is closed. Notice that
the sum of a denumerable number of closed sets may not be a closed set.
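The following example is an added illustration, not part of the original text. It shows a denumerable sum of closed sets that is not closed.

```latex
F_n = \{(x, y) : x^2 + y^2 \le (1 - 1/n)^2\} \quad (n = 1, 2, \ldots), \qquad
\sum_{n=1}^{\infty} F_n = \{(x, y) : x^2 + y^2 < 1\}.
```

Every $F_n$ is a closed circle (a single point for $n = 1$), while the sum is the open unit circle: the points of the circumference $x^2 + y^2 = 1$ are limit points of the sum that do not belong to it.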

THEOREM 4. The set $O - F$ is an open set and $F - O$ is a closed set.
The following equations are easily verified:

$$O - F = O \cdot CF; \qquad F - O = F \cdot CO.$$

Theorem 4 is a consequence of these, in view of the previous
theorems.

We shall say that a set $\mathcal{E}$ is covered by a system $M$ of sets if
every point of $\mathcal{E}$ belongs to at least one of the sets of system $M$.

THEOREM 5 (Borel). If a closed bounded set $F$ is covered by an infinite
system $\alpha$ of open sets $O$, we can extract from this infinite system a finite
number of open sets which also cover $F$.

We use reductio ad absurdum, i.e. we assume that there is no finite
number of open sets of the system $\alpha$ that covers $F$ and hence arrive at a
contradiction. Since $F$ is a bounded set, all the points of $F$ belong to
some finite two-dimensional interval $\Delta_0$ ($a \le x \le b$; $c \le y \le d$). We
split this closed interval $\Delta_0$ into four equal parts, by halving the intervals
$[a, b]$ and $[c, d]$. Each of these four intervals will be taken to be closed.
The points of $F$ which fall into one of these four intervals will form a
closed set by virtue of Theorem 3, and at least one of these closed sets
cannot be covered by a finite number of open sets of the system $\alpha$.
We take the closed interval of our four for which this is the case. We split
this interval into four equal parts and repeat the argument. We thus
obtain a system of embedded intervals $\Delta_0, \Delta_1, \Delta_2, \ldots$, each successive
member of which is a quarter of the preceding one, and the following




holds good: the set of points of $F$ belonging to $\Delta_k$ cannot be covered
by a finite number of open sets of the system $\alpha$ for any $k$. As $k$ increases
indefinitely, the intervals $\Delta_k$ shrink indefinitely to a point $P$, which
belongs to all the $\Delta_k$. Since $\Delta_k$ contains, for any $k$, an infinite set of
points of $F$, the point $P$ is a limit point of $F$, i.e. $P$ belongs to $F$, since
$F$ is a closed set. The point $P$ is therefore covered by some open set $O'$
belonging to the system $\alpha$. An $\varepsilon$-neighbourhood of $P$ will also belong
to the open set $O'$. With sufficiently large values of $k$, the intervals $\Delta_k$
fall inside the above-mentioned $\varepsilon$-neighbourhood of $P$. These $\Delta_k$ will
therefore be entirely covered by the single open set $O'$ of system $\alpha$, and
this contradicts the fact that the points of $F$ that belong to $\Delta_k$ cannot
be covered by a finite number of open sets of $\alpha$ for any $k$. The theorem is
therefore proved.
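Both hypotheses of the theorem, closedness and boundedness, are essential. The example below is an added illustration, not taken from the original text, in which the set covered is bounded but not closed.

```latex
\mathcal{E} = \{(x, y) : x^2 + y^2 < 1\}, \qquad
O_n = \{(x, y) : x^2 + y^2 < (1 - 1/n)^2\} \quad (n = 2, 3, \ldots),
\\[4pt]
\sum_{n=2}^{\infty} O_n = \mathcal{E}, \qquad\text{but}\qquad
\sum_{n=2}^{N} O_n = O_N \ne \mathcal{E} \quad\text{for every finite } N.
```

No finite number of the $O_n$ covers $\mathcal{E}$. In the same way, the whole plane (closed but unbounded) is covered by the open circles $x^2 + y^2 < n^2$, and no finite number of them suffices.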

THEOREM 6. An open set can be expressed as the sum of a denumerable
number of semi-open intervals, pairs of which have no common points
(i.e. the intervals are non-overlapping).

We recall that a semi-open interval on the plane is a finite interval
defined by inequalities of the form $a < x \le b$; $c < y \le d$.

We draw a net of squares on the plane with sides parallel to the
axes and length of side unity. The set of these squares is a denumerable
set. We choose those squares, all the points of which belong to the
given open set $O$. The number of such squares may be finite or
denumerable, or there may be no such squares. Each of the remaining
squares of the net is divided into four equal squares, and from the new
squares obtained we again choose those, every point of which belongs
to $O$. Each of the remaining squares is again divided into four equal
parts, and the squares chosen, every point of which belongs to $O$,
and so on. We show that every point $P$ of the set $O$ falls into one of
the chosen squares, all the points of which belong to $O$. In fact, let $d$
be the positive distance of $P$ from the boundary of $O$. When we arrive
at squares whose diagonals are less than $d$, we can obviously say that
$P$ has already fallen inside a square, every point of which belongs to $O$.
If the chosen squares are regarded as semi-open, any pair of them will
have no common points, and the theorem is proved. The number of
chosen squares must be denumerable, since a finite sum of semi-open
intervals is clearly not an open set. On writing $\Delta_n$ for the semi-open
squares which we have obtained as a result of the above process, we
can write

$$O = \sum_{n=1}^{\infty} \Delta_n. \tag{21}$$
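The construction in the proof of Theorem 6 is an algorithm, and a few steps of it can be run on a simple open set. The sketch below is an illustration under stated assumptions, not part of the original text: the open set is taken to be the open unit circle $x^2 + y^2 < 1$, the function names are hypothetical, and a square is chosen when its closure lies in the circle, which is a slightly stricter test than the one in the proof but still chooses, sooner or later, a square containing any given point of the circle.

```python
import math

def closed_square_in_open_disc(x0, y0, side, radius=1.0):
    """True if the closed square [x0, x0+side] x [y0, y0+side] lies in x^2 + y^2 < radius^2.

    The disc is convex, so it is enough to test the four corners.
    """
    corners = [(x0, y0), (x0 + side, y0), (x0, y0 + side), (x0 + side, y0 + side)]
    return all(x * x + y * y < radius * radius for x, y in corners)

def chosen_squares(max_depth):
    """Semi-open squares (x0, x0+side] x (y0, y0+side] chosen down to side 2**-max_depth."""
    chosen = []
    # The four unit squares of the net that can contain points of the open unit disc.
    pending = [(float(i), float(j), 1.0) for i in (-1, 0) for j in (-1, 0)]
    for _ in range(max_depth + 1):
        remaining = []
        for x0, y0, side in pending:
            if closed_square_in_open_disc(x0, y0, side):
                chosen.append((x0, y0, side))
            else:
                # Divide each remaining square into four equal squares.
                half = side / 2
                remaining.extend((x0 + dx, y0 + dy, half)
                                 for dx in (0.0, half) for dy in (0.0, half))
        pending = remaining
    return chosen

def total_area(squares):
    """Sum of the areas of the chosen squares."""
    return sum(side * side for _, _, side in squares)

def half_open_disjoint(s, t):
    """True if the semi-open squares s and t have no common point."""
    (x0, y0, a), (x1, y1, b) = s, t
    overlap_x = max(x0, x1) < min(x0 + a, x1 + b)
    overlap_y = max(y0, y1) < min(y0 + a, y1 + b)
    return not (overlap_x and overlap_y)

squares = chosen_squares(6)
print(len(squares), total_area(squares), math.pi)
```

With the area taken as interval function, the total area of the chosen squares increases towards the area $\pi$ of the circle as the subdivision is continued, without ever reaching it after a finite number of steps.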




In the case of one dimension, i.e. a straight line, the following
statement is readily proved: every open set on a straight line is the
sum of a finite or denumerable number of non-overlapping open
intervals.

Everything that has been said in the last two sections is applicable
to point sets on a straight line, in three-dimensional space and in general
in $n$-dimensional space. The only difference is in the definition of
$\varepsilon$-neighbourhood and interval. In three-dimensional space an
$\varepsilon$-neighbourhood of a point $P$ is a sphere with centre at $P$ and radius $\varepsilon$, and
an interval is a rectangular parallelepiped, the edges of which are parallel
to the axes. A semi-open interval is defined by the inequalities:
$a_1 < x \le b_1$; $a_2 < y \le b_2$; $a_3 < z \le b_3$. In the case of a straight line
an $\varepsilon$-neighbourhood of a point $x_0$ is defined by the inequality
$x_0 - \varepsilon \le x \le x_0 + \varepsilon$.


33. Elementary figures. A fundamental role will be played in what
follows by finite semi-open intervals, and for brevity we shall speak
of these simply as intervals. Let $G(\Delta)$ be a non-negative, additive and
normal interval function. Our problem is to extend it to a wider
class of point sets whilst preserving all its previous properties. We
shall call the sum of a finite number of non-overlapping intervals
$\Delta_k$ ($k = 1, 2, \ldots, m$) an elementary figure. Using $R$ to denote such
an elementary figure, we can write

$$R = \sum_{k=1}^{m} \Delta_k. \tag{22}$$

We can evidently use a different method to split this elementary
figure into non-overlapping intervals:

$$R = \sum_{k=1}^{m'} \Delta'_k. \tag{23}$$

It is easily seen that we have for any two such subdivisions:

$$\sum_{k=1}^{m} G(\Delta_k) = \sum_{k=1}^{m'} G(\Delta'_k). \tag{24}$$

To see this, we need only carry out a new subdivision of $R$, consisting
of the product of (22) and (23), and recall the fact that $G(\Delta)$ is additive.
It turns out that both the left and right-hand sides of (24) represent
the sum of the values of $G(\Delta)$ for the intervals of the new subdivision.
To obtain the left-hand side of (24), we only need to regroup the terms




of this latter sum that correspond to the sub-intervals which belong
to the same $\Delta_k$, whilst the right-hand side of (24) is got by carrying out
the grouping for terms corresponding to sub-intervals belonging to the
same $\Delta'_k$. Thus, if the elementary figure $R$ is split by some method
into non-overlapping sub-intervals, the sum of the values of the
function $G(\Delta)$ for these sub-intervals has a completely determinate
value, i.e. does not depend on the method of subdivision of $R$. This
sum will be taken as the value of the function $G(R)$ for the elementary
figure $R$, i.e.

$$G(R) = \sum_{k=1}^{m} G(\Delta_k) \tag{25}$$

with any subdivision of $R$ into a finite number of non-overlapping
intervals. We have thus extended the function $G(\Delta)$ very simply to
elementary figures. By making use of (21), we could similarly have
extended $G(\Delta)$ to all open sets. However, we shall adopt a rather
different procedure. At the same time, open sets will play a
fundamental role in our treatment. In the present section, we shall
consider some further simple properties of intervals and elementary
figures.
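The independence of the sum (25) from the method of subdivision can be seen on a one-dimensional model. The sketch below is an illustration only, not part of the original text. It assumes intervals $(a, b]$ on a straight line and an interval function of the form $G((a, b]) = g(b) - g(a)$ with a hypothetical non-decreasing continuous function $g(x) = x^3$; all function names are likewise hypothetical. The "product" of two subdivisions is formed exactly as in the argument for (24).

```python
from fractions import Fraction

def g(x):
    """Hypothetical non-decreasing, continuous generating function."""
    return x ** 3

def G(interval):
    """Interval function G((a, b]) = g(b) - g(a); non-negative and additive."""
    a, b = interval
    return g(b) - g(a)

def G_of_subdivision(intervals):
    """Sum of G over the non-overlapping intervals of a subdivision, as in (25)."""
    return sum(G(interval) for interval in intervals)

def product_subdivision(first, second):
    """Common refinement: all non-empty intersections of intervals of the two subdivisions."""
    refined = []
    for a1, b1 in first:
        for a2, b2 in second:
            lo, hi = max(a1, a2), min(b1, b2)
            if lo < hi:
                refined.append((lo, hi))
    return refined

# Two subdivisions of the same elementary figure R = (0, 1] + (2, 4].
half = Fraction(1, 2)
first = [(Fraction(0), Fraction(1)), (Fraction(2), Fraction(3)), (Fraction(3), Fraction(4))]
second = [(Fraction(0), half), (half, Fraction(1)), (Fraction(2), Fraction(4))]
refined = product_subdivision(first, second)

print(G_of_subdivision(first), G_of_subdivision(second), G_of_subdivision(refined))
```

Exact rational arithmetic (`Fraction`) is used so that the three sums can be compared for equality without rounding effects.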

Notice that, if $R_1 \subset R_2$, then $G(R_1) \le G(R_2)$. This follows at once
from the fact that $G(\Delta)$ is non-negative, if we make use of a division
of $R_2$ into intervals such that sub-intervals having points in common
with $R_1$ are wholly contained in $R_1$. Let $\delta_k$ ($k = 1, \ldots, p$) be intervals
which may overlap. If we produce the straight lines on which the
sides of the $\delta_k$ lie, we split the sum of intervals $\delta_k$ into sub-intervals
that have the following property: if two of them have a
common point, they are entirely coincident. On reckoning
superimposed intervals as a single interval, we obtain an elementary figure
$R_0$, which is obviously the sum of the intervals: $R_0 = \sum_{k=1}^{p} \delta_k$, where we have

$$G(R_0) \le \sum_{k=1}^{p} G(\delta_k), \tag{26}$$

and the $<$ sign holds if the value of the function $G(\Delta)$ is positive for at
any rate one of the superimposed intervals.

We now introduce a new concept which will be useful in later
constructions. Let $\Delta$ ($a < x \le b$, $c < y \le d$) be an interval and $\alpha$ a
positive number. We call the interval defined by the inequalities
$a + \alpha < x \le b$, $c + \alpha < y \le d$ an $\alpha$-compression of the interval $\Delta$,




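and write it symbolically as ${}^{(\alpha)}\!\Delta$. We define an $\alpha$-expansion of $\Delta$ as the
interval defined by $a < x \le b + \alpha$, $c < y \le d + \alpha$, and write it
symbolically as $\Delta^{(\alpha)}$. The differences $\Delta - {}^{(\alpha)}\!\Delta = {}^{(\alpha)}\!R$ and
$\Delta^{(\alpha)} - \Delta = R^{(\alpha)}$ are elementary figures. Since $G(\Delta)$ is non-negative, we have

$$G({}^{(\alpha)}\!R) = G(\Delta) - G({}^{(\alpha)}\!\Delta) \ge 0 \quad\text{and}\quad G(R^{(\alpha)}) = G(\Delta^{(\alpha)}) - G(\Delta) \ge 0,$$

and it follows at once from the normality of $G(\Delta)$ that

$$\lim_{\alpha \to +0} G({}^{(\alpha)}\!\Delta) = \lim_{\alpha \to +0} G(\Delta^{(\alpha)}) = G(\Delta). \tag{$26_1$}$$

We now prove a lemma which will be needed later.

Lemma. If the elementary figure $R$ is covered by a finite or
denumerable number of intervals $\delta_k$ (which may have points in common), then

$$\sum_k G(\delta_k) \ge G(R).^{\dagger} \tag{27}$$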


The assertion of the lemma can be seen by inspection. We obtain a
rigorous proof by using Theorem 5 of [32] (Borel's theorem). Let $\varepsilon$ be a
given positive number. We split $R$ into a finite number of non-overlapping
intervals $\Delta_k$ ($k = 1, 2, \ldots, m$), and subject each sub-interval $\Delta_k$ to an
$\alpha$-compression, the positive number $\alpha$ being fixed so small that the
sum of the values of $G(\Delta)$ is $> G(R) - \varepsilon$ for the compressed intervals.
On writing $R_\alpha$ for the elementary figure equal to the sum of the
compressed intervals, we can write

$$G(R_\alpha) > G(R) - \varepsilon. \tag{28}$$

Each of the intervals $\delta_k$ appearing in the covering of $R$ is subjected
to an $\alpha_k$-expansion, the positive numbers $\alpha_k$ being chosen so small that

$$G(\delta_k^{(\alpha_k)}) < G(\delta_k) + \frac{\varepsilon}{2^k}. \tag{29}$$

The compressed intervals, the sum of which has given $R_\alpha$, are made
into closed intervals, i.e. we close each of these intervals. The sum
of the closed intervals obtained (the number of which is finite) is a
closed set $F$, where obviously $F \subset R$. If we exclude the boundary from
the $\delta_k^{(\alpha_k)}$, i.e. two sides and one vertex, an open interval remains. On
recalling the expansion of the intervals $\delta_k$, we can say that these open
intervals cover the above-mentioned closed intervals, i.e. cover
the bounded closed set $F$. For instance, let the number of intervals
$\delta_k$ be infinite. By Theorem 5 of [32], it is sufficient to take a finite number of
these open intervals, say those with $k = 1, 2, \ldots, q$, to cover $F$ and hence cover $R_\alpha$.

† $\sum_k$ means $\sum_{k=1}^{m}$, where $m = \infty$ or is $\ge 1$ and finite.




The sum of the intervals $\delta_k^{(\alpha_k)}$ ($k = 1, 2, \ldots, q$) is an elementary
figure $R'$, where $R_\alpha \subset R'$, and consequently $G(R_\alpha) \le G(R')$. We also
have, by (26) and (29):

$$G(R') \le \sum_{k=1}^{q} G(\delta_k^{(\alpha_k)}) < \sum_{k=1}^{q} G(\delta_k) + \varepsilon \sum_{k=1}^{q} \frac{1}{2^k},$$

whence it follows at once that

$$G(R_\alpha) \le G(R') < \sum_{k=1}^{q} G(\delta_k) + \varepsilon,$$

so that all the more:

$$G(R_\alpha) < \sum_k G(\delta_k) + \varepsilon.$$

On comparing with (28), we obtain

$$G(R) - \varepsilon < \sum_k G(\delta_k) + \varepsilon, \quad\text{i.e.}\quad \sum_k G(\delta_k) > G(R) - 2\varepsilon.$$

The sum on the left is independent of $\varepsilon$, and in view of the
arbitrariness of $\varepsilon$, we obtain inequality (27). Notice that the terms of the
sum on the left of (27) are non-negative and finite, whilst the sum
itself may be equal to $(+\infty)$.

We shall often be concerned in future with sums of an infinite number
of non-negative terms. If at least one of the terms of such a sum is
equal to $(+\infty)$, the total sum must be reckoned equal to $(+\infty)$.
But, as we have just indicated, it may happen that all the terms are
finite, whilst the sum is equal to $(+\infty)$, i.e. the series is divergent.


34. Exterior measure and its properties. We now use the function
$G(\Delta)$ to associate any point set $\mathcal{E}$ on the plane with a non-negative
number, which will be termed the exterior measure of the set.

DEFINITION. Let a set $\mathcal{E}$ be covered by intervals $\Delta_n$ ($n = 1, 2, \ldots$),
the number of which is finite or denumerable. The exterior measure of $\mathcal{E}$
is the strict lower bound of the values of the sums:

$$\sum_n G(\Delta_n) \tag{30}$$

for all possible coverings of $\mathcal{E}$ by intervals. We shall denote the exterior
measure by the symbol $|\mathcal{E}|_G$, where the subscript indicates the function
$G(\Delta)$ on which the definition of exterior measure is based. Thus, we can
write for any covering:

$$\sum_n G(\Delta_n) \ge |\mathcal{E}|_G \quad\text{and}\quad |\mathcal{E}|_G = \inf \sum_n G(\Delta_n). \tag{31}$$




If sums (30) are equal to $(+\infty)$ for any covering, the exterior
measure has to be regarded as equal to $(+\infty)$. The exterior measure
of a bounded set is always finite, since such a set can be covered by a
single interval $\Delta_0$, and $G(\Delta_0)$ is finite by hypothesis. Notice that an
unbounded set $\mathcal{E}$ cannot be covered by a finite number of intervals,
since we have agreed to take each interval as finite. Nevertheless, the
exterior measure of an unbounded set may be a finite number. We
shall now prove a number of theorems on exterior measure.
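That an unbounded set can have a finite exterior measure is easily seen on a straight line. The sketch below is an illustration under stated assumptions, not part of the original text: $G(\Delta)$ is taken to be the length of the interval, the set is the set of all integers (of which only a finite piece is listed), and the names `cover_points` and `sigma` are hypothetical. The $n$-th point is covered by a semi-open interval of length $\varepsilon/2^n$, so that the sum (30) stays below $\varepsilon$ however many points are covered.

```python
from fractions import Fraction

def cover_points(points, eps):
    """Cover the n-th point x_n by the semi-open interval (x_n - eps/2**n, x_n]."""
    return [(x - eps / 2 ** n, x) for n, x in enumerate(points, start=1)]

def sigma(intervals):
    """Sum (30) for the assumed interval function G(interval) = length."""
    return sum(b - a for a, b in intervals)

points = [Fraction(k) for k in range(-50, 51)]   # finite piece of the set of integers
eps = Fraction(1, 1000)
covering = cover_points(points, eps)

print(sigma(covering) < eps)
```

Since $\varepsilon$ is arbitrary, the exterior measure of the set of all integers with respect to length is zero, and the same argument applies to any denumerable set of points.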

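THEOREM 1. If $\mathcal{E}' \subset \mathcal{E}''$, then $|\mathcal{E}'|_G \le |\mathcal{E}''|_G$.

Every covering of $\mathcal{E}''$ is a covering of $\mathcal{E}'$, so that the lower bound
of sums (30) for $\mathcal{E}'$ may be less than for $\mathcal{E}''$, but can never be greater
than for $\mathcal{E}''$, which is what we had to prove.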

THEOREM 2. The exterior measure of every elementary figure $R$ is
equal to $G(R)$, i.e. $|R|_G = G(R)$.

If $R$ is divided by some method into sub-intervals $\Delta_k$, these latter
cover $R$ and we get $|R|_G \le G(R)$, by (25) and the definition of
exterior measure as the lower bound of sums (30) for all possible coverings
of $R$. We now prove the reverse inequality. If the intervals $\Delta_k$ cover $R$,
we have $\sum_k G(\Delta_k) \ge G(R)$ by the lemma of the previous section,
whence it follows immediately that $|R|_G \ge G(R)$. These two inequalities
together give us $|R|_G = G(R)$.

THEOREM 3. The exterior measure of the sum of a finite or denumerable
number of sets is $\le$ the sum of the exterior measures of the individual
sets, i.e.

$$\Bigl|\sum_n \mathcal{E}_n\Bigr|_G \le \sum_n |\mathcal{E}_n|_G. \tag{32}$$

We shall often use the single letter $S$ in future to denote the covering
of a given set. In this case, we write $\sigma(S)$ for the sum (30) for such a
covering. Given a positive $\varepsilon$, by the definition of strict lower bound,
there exists a covering $S_n$ of the set $\mathcal{E}_n$ such that
$\sigma(S_n) \le |\mathcal{E}_n|_G + \varepsilon/2^n$. We take the intervals appearing in all the $S_n$ ($n = 1, 2, \ldots$).
They effect a covering $S$ for $\sum_n \mathcal{E}_n$, and we obviously have for this
covering:

$$\sigma(S) = \sum_n \sigma(S_n) \le \sum_n |\mathcal{E}_n|_G + \varepsilon \sum_n \frac{1}{2^n} \le \sum_n |\mathcal{E}_n|_G + \varepsilon.$$

By the definition of strict lower bound:

$$\Bigl|\sum_n \mathcal{E}_n\Bigr|_G \le \sum_n |\mathcal{E}_n|_G + \varepsilon,$$

and, since $\varepsilon$ is arbitrary, we in fact obtain inequality (32). Notice




that the sum on the right-hand side of (32), or even an individual
term of this sum, can be equal to $(+\infty)$. Since the terms are
non-negative, their order is of no importance. Now that the exterior
measure has been defined we have extended the function $G(\Delta)$ to
all possible point sets on the plane, but, generally speaking, we have
lost the additive property of this function. For it can be shown that,
if pairs of the sets $\mathcal{E}_n$ have no common points, we can nevertheless
have the $<$ sign in certain cases in (32). Below, we shall distinguish
the class of sets for which the exterior measure retains the property
of additivity. As a preliminary, we prove a further theorem on
exterior measure.

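THEOREM 4. Every set $\mathcal{E}$ can be covered by an open set $O$ whose exterior
measure differs by as little as desired from the exterior measure of $\mathcal{E}$, i.e.
if $\mathcal{E}$ is any set and $\varepsilon$ is any given positive number, there exists an open
set $O$ such that $\mathcal{E} \subset O$ and $|O|_G \le |\mathcal{E}|_G + \varepsilon$.

If $|\mathcal{E}|_G = +\infty$, the inequality $|O|_G \le |\mathcal{E}|_G + \varepsilon$ is satisfied for
any covering of the set by an open set. We shall assume from now on
that $|\mathcal{E}|_G$ is finite. Let $\varepsilon$ be a given positive number. We choose the
covering of $\mathcal{E}$ by intervals $\Delta_n$ such that the inequality holds:

$$\sum_n G(\Delta_n) \le |\mathcal{E}|_G + \frac{\varepsilon}{2}. \tag{33}$$

Each of the intervals $\Delta_n$ is subjected to an $\alpha_n$-expansion, and the
positive numbers $\alpha_n$ are chosen so that

$$G(\Delta_n^{(\alpha_n)}) \le G(\Delta_n) + \frac{\varepsilon}{2^{n+1}}. \tag{34}$$

If we exclude the boundary from $\Delta_n^{(\alpha_n)}$, i.e. two sides and one vertex, the
sum of the open intervals obtained gives some open set $O$, which
is obviously covered by the intervals $\Delta_n^{(\alpha_n)}$, and which contains $\mathcal{E}$,
since each $\Delta_n$ lies in the corresponding open interval. We have by the
definition of strict lower bound:

$$|O|_G \le \sum_n G(\Delta_n^{(\alpha_n)}).$$

We use (34) to write further:

$$|O|_G \le \sum_n G(\Delta_n) + \frac{\varepsilon}{2},$$

and finally, on taking (33) into account, we get

$$|O|_G \le |\mathcal{E}|_G + \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = |\mathcal{E}|_G + \varepsilon.$$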




35. Measurable sets. We now distinguish a class of sets which will
be described as measurable, and for which we shall later prove that
the exterior measure is additive. We shall refer to the exterior measure
of a measurable set as simply the measure.

DEFINITION. A set $\mathcal{E}$ is said to be measurable if it can be covered by an
open set $O$ such that the exterior measure of the difference $O - \mathcal{E}$ is as
small as desired, i.e. $\mathcal{E}$ is said to be measurable if, given a positive $\varepsilon$,
there exists an open set $O$ such that $\mathcal{E} \subset O$ and $|O - \mathcal{E}|_G \le \varepsilon$. The
exterior measure of a measurable set will be referred to simply as the measure
of the set.

The requirement imposed in the definition of measurable set is
stronger than the property stated in Theorem 4. This latter property
holds for all sets, whilst, given a certain choice of $G(\Delta)$, there exist
sets which are not measurable, i.e. which are not subject to the above
definition. We can use the symbol $|\mathcal{E}|_G$ to denote the measure of a
measurable set, since the measure of a measurable set coincides by
definition with exterior measure. We prove next that any interval $\Delta$
is measurable. Its exterior measure is equal to $G(\Delta)$ by Theorem 2. We
shall see later that every elementary figure is measurable. We are
therefore justified in simply writing $G(\mathcal{E})$ for the measure of any
measurable set $\mathcal{E}$. We further agree to take the exterior measure, or
simply the measure, of the empty set as zero. This is in accordance
with our above definitions. We introduce one further definition: a set
$\mathcal{E}$ is called a set of measure zero with respect to $G(\Delta)$, or simply a set
of measure zero (since $G(\Delta)$ is assumed fixed), if $|\mathcal{E}|_G = 0$. It follows
at once from this definition that every part of a set of measure zero is
also a set of measure zero. We now prove a number of properties of
measurable sets. These properties will serve as a basis for the whole
of our future treatment.

THEOREM 5. An open set is measurable.

If $\mathcal{E}$ is an open set, it can be shown to be measurable simply by
taking $O$ as coincident with $\mathcal{E}$. In this case $|O - \mathcal{E}|_G = 0$.

THEOREM 6. Any interval $\Delta$ is a measurable set, and its measure is
equal to $G(\Delta)$.

Let $\Delta$ be an interval. On subjecting it to an $\alpha$-expansion, we obtain
the interval $\Delta^{(\alpha)}$. The difference $\Delta^{(\alpha)} - \Delta$ is an elementary figure,
and since $G(\Delta)$ is normal, we have $G(\Delta^{(\alpha)} - \Delta) = G(\Delta^{(\alpha)}) - G(\Delta) \to 0$
as $\alpha \to +0$, i.e. given any positive $\varepsilon$, there exists an $\alpha$ such that
$G(\Delta^{(\alpha)} - \Delta) \le \varepsilon$. Let $O$ be the open interval
which is obtained from $\Delta^{(\alpha)}$ by the removal of its boundary. In view




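of the expansion process, $\Delta \subset O$. To prove the measurability of $\Delta$, it
remains to show that $|O - \Delta|_G \le \varepsilon$. We have $O \subset \Delta^{(\alpha)}$, and by
Theorem 1, we have $|O - \Delta|_G \le |\Delta^{(\alpha)} - \Delta|_G = G(\Delta^{(\alpha)} - \Delta) \le \varepsilon$,
i.e. $|O - \Delta|_G \le \varepsilon$, which is what we had to prove. The measure of $\Delta$,
equal to the exterior measure of $\Delta$, is the same as $G(\Delta)$, since $\Delta$ is a
particular case of an elementary figure.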

THEOREM 7. The concept of a set of measure zero, i.e. a set with an
exterior measure equal to zero, is the same as the concept of a measurable
set whose measure is zero.

If $|\mathcal{E}|_G = 0$, there exists by Theorem 4, given any positive $\varepsilon$, an
open set $O$ such that $\mathcal{E} \subset O$ and $|O|_G \le \varepsilon$, i.e. by Theorem 1, all the
more $|O - \mathcal{E}|_G \le \varepsilon$, i.e. $\mathcal{E}$ is measurable. The measure of $\mathcal{E}$ is equal
to its exterior measure, i.e. is zero. Conversely, if $\mathcal{E}$ is a measurable
set and its measure is zero, by the definition of measure the exterior
measure of $\mathcal{E}$ is also zero, and the theorem is therefore proved.

THEOREM 8. The sum of a finite or denumerable number of measurable
sets is a measurable set.

Let $\mathcal{E}_n$ be measurable sets, $\mathcal{E}$ their sum, $\mathcal{E} = \sum_n \mathcal{E}_n$, and $\varepsilon$ a given
positive number. By the definition of measurable set, there exist open
sets $O_n$ such that $\mathcal{E}_n \subset O_n$ and $|O_n - \mathcal{E}_n|_G \le \varepsilon/2^n$. The sum of the
open sets $O_n$ is an open set $O$, where obviously $\mathcal{E} \subset O$, and by (6) of
[30], we have

$$O - \mathcal{E} \subset \sum_n (O_n - \mathcal{E}_n).$$

By using Theorems 1 and 3, we obtain

$$|O - \mathcal{E}|_G \le \Bigl|\sum_n (O_n - \mathcal{E}_n)\Bigr|_G \le \sum_n |O_n - \mathcal{E}_n|_G$$

or, since $|O_n - \mathcal{E}_n|_G \le \varepsilon/2^n$:

$$|O - \mathcal{E}|_G \le \varepsilon,$$

which proves the measurability of $\mathcal{E}$. We now turn to a proof of the
measurability of closed sets. We must first prove a lemma.

Lemma. If the distance between two sets $\mathcal{E}_1$ and $\mathcal{E}_2$ is given by a positive
number, then $|\mathcal{E}_1 + \mathcal{E}_2|_G = |\mathcal{E}_1|_G + |\mathcal{E}_2|_G$.

Let $d$ be the positive distance between sets $\mathcal{E}_1$ and $\mathcal{E}_2$. Given any
positive $\varepsilon$, there exists a covering $S$ of the set $\mathcal{E}_1 + \mathcal{E}_2$ such that

$$\sigma(S) \le |\mathcal{E}_1 + \mathcal{E}_2|_G + \varepsilon. \tag{35}$$

We divide each of the intervals appearing in $S$ into a finite number
of intervals in such a way that the diagonals of all the intervals
obtained are less than $d$. All the intervals of $S$ are now divided into three
classes: the first class contains the intervals which cover only points
of $\mathcal{E}_1$, the second contains intervals which cover only points of $\mathcal{E}_2$, and
finally, the third contains intervals that cover neither points of $\mathcal{E}_1$
nor points of $\mathcal{E}_2$. There are no intervals that cover both points of $\mathcal{E}_1$
and points of $\mathcal{E}_2$. We can simply throw the intervals of the third class
out of the division of $S$. The sum $\sigma(S)$ can now only diminish, and
inequality (35) retains its force. We can therefore assume that the
covering $S$ is split into coverings $S_1$ and $S_2$, where the intervals of $S_1$
cover $\mathcal{E}_1$ and have no common points with $\mathcal{E}_2$, and the intervals of $S_2$
cover $\mathcal{E}_2$ and have no common points with $\mathcal{E}_1$. We have
$\sigma(S) = \sigma(S_1) + \sigma(S_2)$ and, by (35),

$$\sigma(S_1) + \sigma(S_2) \le |\mathcal{E}_1 + \mathcal{E}_2|_G + \varepsilon. \tag{36}$$

It follows from the definition of strict lower bound that
$|\mathcal{E}_1|_G \le \sigma(S_1)$ and $|\mathcal{E}_2|_G \le \sigma(S_2)$, and inequality (36) leads to the inequality
$|\mathcal{E}_1|_G + |\mathcal{E}_2|_G \le |\mathcal{E}_1 + \mathcal{E}_2|_G + \varepsilon$, whence, since $\varepsilon$ is arbitrary, we
have $|\mathcal{E}_1|_G + |\mathcal{E}_2|_G \le |\mathcal{E}_1 + \mathcal{E}_2|_G$. On the other hand, Theorem 3
gives $|\mathcal{E}_1 + \mathcal{E}_2|_G \le |\mathcal{E}_1|_G + |\mathcal{E}_2|_G$. We therefore obtain
$|\mathcal{E}_1 + \mathcal{E}_2|_G = |\mathcal{E}_1|_G + |\mathcal{E}_2|_G$, which proves the lemma.

COROLLARY. If $F_1$ and $F_2$ are two disjoint closed sets at least
one of which is bounded, then $|F_1 + F_2|_G = |F_1|_G + |F_2|_G$.
If $F_k$ ($k = 1, 2, \ldots, m$) are pairwise disjoint closed bounded sets, then

$$\Bigl|\sum_{k=1}^{m} F_k\Bigr|_G = \sum_{k=1}^{m} |F_k|_G.$$

This corollary is proved simply by using what was said in [31] regarding
the distance between closed sets with no common points.

THEOREM 9. Closed sets are measurable.

We first suppose that $F$ is a bounded closed set, and let $\varepsilon$ be a given
positive number. By Theorem 4, there exists an open set $O$ such
that $F \subset O$ and $|O|_G \le |F|_G + \varepsilon$. We show that this open set $O$
will in fact satisfy the inequality

$$|O - F|_G \le \varepsilon, \tag{37}$$

which appears in the definition of measurable set. By Theorem 4 of
[32], the difference $O - F$ is an open set, so that, by Theorem 6 of [32],
it can be written as the sum of a denumerable number of non-overlapping
intervals $\Delta_n$:

$$O - F = \sum_{n=1}^{\infty} \Delta_n. \tag{38}$$




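Having fixed the positive integer $m$, we consider the sum of the first
$m$ terms of sum (38), where each interval appearing in this sum is
subjected to an $\alpha$-compression by choosing the positive number $\alpha$ in
some given way. We thus obtain an elementary figure $R$:

$$R = \sum_{n=1}^{m} {}^{(\alpha)}\!\Delta_n.$$

If we close each interval ${}^{(\alpha)}\!\Delta_n$ ($n = 1, 2, \ldots, m$), the sum written
gives us a closed set, which obviously coincides with the closure $\bar{R}$
of the elementary figure $R$. Each closed interval $\overline{{}^{(\alpha)}\!\Delta_n}$ is covered by the
corresponding interval $\Delta_n$ forming part of sum (38). Hence the closed
set $\bar{R}$ has no points in common with $F$, so that the distance between
these sets is positive. The distance between $R$ and $F$ will be all the more
positive, and we have by the lemma:

$$\Bigl|\sum_{n=1}^{m} {}^{(\alpha)}\!\Delta_n + F\Bigr|_G = \Bigl|\sum_{n=1}^{m} {}^{(\alpha)}\!\Delta_n\Bigr|_G + |F|_G.$$

But, by (38), $\sum_{n=1}^{m} {}^{(\alpha)}\!\Delta_n + F \subset O$, so that

$$\Bigl|\sum_{n=1}^{m} {}^{(\alpha)}\!\Delta_n\Bigr|_G + |F|_G \le |O|_G.$$

On taking into account the inequality $|O|_G \le |F|_G + \varepsilon$, we obtain
from this:

$$\Bigl|\sum_{n=1}^{m} {}^{(\alpha)}\!\Delta_n\Bigr|_G + |F|_G \le |F|_G + \varepsilon.$$

By hypothesis, $F$ is a bounded set, so that $|F|_G$ is a finite number.
The last inequality leads us to

$$\Bigl|\sum_{n=1}^{m} {}^{(\alpha)}\!\Delta_n\Bigr|_G \le \varepsilon.$$

But

$$\Bigl|\sum_{n=1}^{m} {}^{(\alpha)}\!\Delta_n\Bigr|_G = G(R) = \sum_{n=1}^{m} G({}^{(\alpha)}\!\Delta_n) = \sum_{n=1}^{m} |{}^{(\alpha)}\!\Delta_n|_G,$$

and the previous inequality can be rewritten as

$$\sum_{n=1}^{m} |{}^{(\alpha)}\!\Delta_n|_G \le \varepsilon.$$

On first letting $\alpha$ tend to zero, then $m$ to infinity, we get the
inequality

$$\sum_{n=1}^{\infty} |\Delta_n|_G \le \varepsilon.$$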
Finally, it follows from (38) and Theorem 3 that 
|O-Flo< Xl4nlo <e, 
n=1 
i.e. inequality (37). Now let the closed set F be unbounded. Let y, be 


a closed circle with centre at the origin and radius n. We can form 
the closed bounded sets F, = F - yn, and write 


A 


F = 


n 


Ep 


j 
m 


and the measurability of F follows at once from Theorem 8. Theorem 9 
is thus proved. 

THEOREM 10. If $\mathcal{E}$ is a measurable set, the complementary set $C\mathcal{E}$ is measurable.

Since $\mathcal{E}$ is measurable, there exist open sets $O_n$ such that $\mathcal{E} \subset O_n$ and $|O_n - \mathcal{E}|_G \le 1/n$. We construct the closed sets $F_n = CO_n$. Since $\mathcal{E} \subset O_n$, we have $F_n \subset C\mathcal{E}$, and in addition, by (19) of [30], we can write the equation: $C\mathcal{E} - F_n = O_n - \mathcal{E}$. On replacing $F_n$ on the left-hand side by the sum of all the $F_k$, we get

$$C\mathcal{E} - \sum_{k=1}^{\infty} F_k \subset C\mathcal{E} - F_n, \quad\text{i.e.}\quad C\mathcal{E} - \sum_{k=1}^{\infty} F_k \subset O_n - \mathcal{E},$$

and, since $|O_n - \mathcal{E}|_G \le 1/n$, we have

$$\left|C\mathcal{E} - \sum_{k=1}^{\infty} F_k\right|_G \le \frac{1}{n}.$$

The left-hand side of the inequality written is independent of $n$, and by letting $n$ tend to infinity, we arrive at the equation

$$\left|C\mathcal{E} - \sum_{k=1}^{\infty} F_k\right|_G = 0,$$

from which it follows that the difference on the left is a set $\mathcal{E}_0$ of measure zero. We can thus write $C\mathcal{E}$ as a sum of measurable sets:

$$C\mathcal{E} = \mathcal{E}_0 + \sum_{k=1}^{\infty} F_k,$$

whence it follows, by Theorem 8, that $C\mathcal{E}$ is measurable. According to the definition, the measurability of a set is established with the aid of open sets. We show in the next theorem that the measurability of a set can be similarly established with the aid of closed sets.

THEOREM 11. The necessary and sufficient condition for $\mathcal{E}$ to be measurable is that, given any positive $\varepsilon$, there exists a closed set $F$ such that $F \subset \mathcal{E}$ and $|\mathcal{E} - F|_G \le \varepsilon$.

The measurability of $\mathcal{E}$ is equivalent to the measurability of $C\mathcal{E}$, and the necessary and sufficient condition for this is that, given any positive $\varepsilon$, there exists an open set $O$ such that $C\mathcal{E} \subset O$ and $|O - C\mathcal{E}|_G \le \varepsilon$. If we put $F = CO$ and notice that, by (19) of [30], $O - C\mathcal{E} = \mathcal{E} - CO = \mathcal{E} - F$ and $F \subset \mathcal{E}$, we obtain the assertion of the theorem.

THEOREM 12. The product of a finite or denumerable number of measur- 
able sets is a measurable set. The difference between two measurable sets is 
a measurable set. 

If the sets $\mathcal{E}_n$ are measurable, the measurability of their product follows directly from the formula [30]:

$$\prod_n \mathcal{E}_n = C \sum_n C\mathcal{E}_n,$$

and Theorems 10 and 8. If $A$ and $B$ are measurable, the measurability of their difference follows at once from the formula of [30]: $A - B = A \cdot CB$ and the measurability of the product.

THEOREM 13. The measure of the sum of a finite or denumerable number 
of pairwise disjoint measurable sets is equal to the sum of the measures 
of the individual sets. 

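Let $\mathcal{E}_n$ be pairwise disjoint measurable sets. The measurability of their sum follows from Theorem 8. We suppose first that each of the $\mathcal{E}_n$ is bounded. By Theorem 11, given any positive $\varepsilon$, there exist closed sets $F_n$ such that $F_n \subset \mathcal{E}_n$ and $|\mathcal{E}_n - F_n|_G \le \varepsilon/2^n$, where the $F_n$ are obviously bounded and pairwise disjoint. The formula $\mathcal{E}_n = F_n + (\mathcal{E}_n - F_n)$ implies at once that

$$|\mathcal{E}_n|_G \le |F_n|_G + \frac{\varepsilon}{2^n}.$$

On the other hand, by considering the first $m$ of sets $F_n$, we find that $\sum_{n=1}^{m} F_n \subset \sum_n \mathcal{E}_n$, and consequently

$$\left|\sum_n \mathcal{E}_n\right|_G \ge \left|\sum_{n=1}^{m} F_n\right|_G.$$

We can apply our above lemma to a finite sum of pairwise disjoint closed sets $F_n$, and hence obtain, on also making use of the inequality $|F_n|_G \ge |\mathcal{E}_n|_G - \varepsilon/2^n$,

$$\left|\sum_n \mathcal{E}_n\right|_G \ge \sum_{n=1}^{m} |F_n|_G \ge \sum_{n=1}^{m} |\mathcal{E}_n|_G - \sum_{n=1}^{m} \frac{\varepsilon}{2^n}.$$

Let us take the most complicated case, when the number of sets $\mathcal{E}_n$ is infinite. On indefinitely increasing the number $m$ in the last inequality, we have

$$\left|\sum_{n=1}^{\infty} \mathcal{E}_n\right|_G \ge \sum_{n=1}^{\infty} |\mathcal{E}_n|_G - \varepsilon,$$

or, since $\varepsilon$ is arbitrary:

$$\left|\sum_{n=1}^{\infty} \mathcal{E}_n\right|_G \ge \sum_{n=1}^{\infty} |\mathcal{E}_n|_G.$$

On comparing this inequality with (32), we arrive at the equation:

$$\left|\sum_{n=1}^{\infty} \mathcal{E}_n\right|_G = \sum_{n=1}^{\infty} |\mathcal{E}_n|_G, \tag{39}$$

which proves our theorem. In view of the measurability of the $\mathcal{E}_n$ and their sum, we can write the last formula as

$$G(\mathcal{E}_1 + \mathcal{E}_2 + \mathcal{E}_3 + \ldots) = G(\mathcal{E}_1) + G(\mathcal{E}_2) + G(\mathcal{E}_3) + \ldots \tag{40}$$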


We now take the case when the $\mathcal{E}_n$ include unbounded sets. Let $\gamma_n$ be a closed circle with centre at the origin and radius $n$. We take the sets

$$\mathcal{E}_n^{(1)} = \mathcal{E}_n \gamma_1; \quad \mathcal{E}_n^{(2)} = \mathcal{E}_n (\gamma_2 - \gamma_1); \quad \mathcal{E}_n^{(3)} = \mathcal{E}_n (\gamma_3 - \gamma_2); \quad \ldots$$

Each of them is bounded, and they are all measurable, since the closed set $\gamma_1$ and the difference between closed sets $\gamma_k - \gamma_{k-1}$ are measurable sets, whilst the product of measurable sets is also measurable. We can write each of the $\mathcal{E}_n$ as a sum of pairwise disjoint bounded measurable sets:

$$\mathcal{E}_n = \sum_{k=1}^{\infty} \mathcal{E}_n^{(k)},$$

and, by what has been proved, we have

$$|\mathcal{E}_n|_G = \sum_{k=1}^{\infty} |\mathcal{E}_n^{(k)}|_G. \tag{41}$$

The sum $\mathcal{E}$ of sets $\mathcal{E}_n$ can be rewritten as a double sum of pairwise disjoint bounded sets $\mathcal{E}_n^{(k)}$, certain of which may be empty:

$$\mathcal{E} = \sum_{n=1}^{\infty} \sum_{k=1}^{\infty} \mathcal{E}_n^{(k)}.$$

By what has been proved, we have

$$|\mathcal{E}|_G = \sum_{n=1}^{\infty} \sum_{k=1}^{\infty} |\mathcal{E}_n^{(k)}|_G.$$

Since the terms are non-negative, their order is of no importance [I; 134]. We shall sum first over $k$, then over $n$. On using (41), we thus arrive at (39) again, and the theorem is fully proved.

Note. If we dispense with the assumption that no pair of the $\mathcal{E}_n$ has common points, we have for the sum $\mathcal{E}$ of the $\mathcal{E}_n$, which is measurable by Theorem 8,

$$G(\mathcal{E}) \le \sum_{n} G(\mathcal{E}_n). \tag{40$_1$}$$

This follows at once from Theorem 3 and the fact that the exterior measure is simply the measure for measurable sets. If the measure is zero for all the $\mathcal{E}_n$, (40$_1$) gives $G(\mathcal{E}) \le 0$. But the measure cannot be negative, therefore $G(\mathcal{E}) = 0$, i.e. the sum of a finite or denumerable number of sets of measure zero is a set of measure zero.
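The device behind this statement can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes ordinary length on the line as the measure, and supposes that the $n$-th point of a denumerable set is covered by an interval of length $\varepsilon/2^n$ (the same weights as in the proof of Theorem 13). The name `cover_lengths` is ours, not the book's.

```python
from fractions import Fraction


def cover_lengths(eps, count):
    """Lengths eps/2**n, n = 1..count, of intervals covering the first `count` points."""
    return [eps / 2**n for n in range(1, count + 1)]


eps = Fraction(1, 100)
total = sum(cover_lengths(eps, 40))

# The partial sum equals eps * (1 - 2**-40), so it stays below eps
# however many points are covered.
print(total < eps, eps - total == eps / 2**40)
```

Exact rational arithmetic is used so that the comparison with $\varepsilon$ is not blurred by rounding.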

All the theorems proved above are also valid for measurable sets of infinite measure. A proviso in this connection has to be made in the theorems that follow.

THEOREM 14. If $A$ and $B$ are measurable, $B \subset A$, and $B$ is of finite measure, then

$$G(A - B) = G(A) - G(B). \tag{42}$$

The difference $A - B = D$ is measurable by Theorem 12. We have $A = B + D$, where $B$ and $D$ have no common points. By Theorem 13, $G(A) = G(B) + G(D)$, and subtraction of the finite number $G(B)$ from both sides gives us (42).


THEOREM 15. If $\mathcal{E}_n$ $(n = 1, 2, \ldots)$ is a non-decreasing sequence of measurable sets, the limit set $\mathcal{E}$ is measurable and

$$G(\mathcal{E}) = \lim_{n \to \infty} G(\mathcal{E}_n). \tag{43}$$

The measurability of $\mathcal{E}$ follows at once from the formula

$$\mathcal{E} = \lim_{n \to \infty} \mathcal{E}_n = \mathcal{E}_1 + (\mathcal{E}_2 - \mathcal{E}_1) + (\mathcal{E}_3 - \mathcal{E}_2) + \ldots \tag{44}$$

The terms on the right have no common points, and if all the $\mathcal{E}_n$ are of finite measure, we have

$$G(\mathcal{E}) = G(\mathcal{E}_1) + [G(\mathcal{E}_2) - G(\mathcal{E}_1)] + [G(\mathcal{E}_3) - G(\mathcal{E}_2)] + \ldots$$

The sum of the first $n$ terms on the right is equal to $G(\mathcal{E}_n)$, i.e. (43) follows from the last formula. If one of the $\mathcal{E}_n$ is of infinite measure, the limit set is all the more of infinite measure, and (43) is obvious. Notice that the value $(+\infty)$ is permissible in this formula, both for $G(\mathcal{E}_n)$ and for $G(\mathcal{E})$.

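THEOREM 16. If $\mathcal{E}_n$ $(n = 1, 2, \ldots)$ is a non-increasing sequence of sets of finite measure, the limit set $\mathcal{E}$ is measurable, and (43) holds.

We write $\mathcal{E}_1$ as a sum of pairwise disjoint sets:

$$\mathcal{E}_1 = \mathcal{E} + (\mathcal{E}_1 - \mathcal{E}_2) + (\mathcal{E}_2 - \mathcal{E}_3) + (\mathcal{E}_3 - \mathcal{E}_4) + \ldots \tag{45}$$

The measurability of $\mathcal{E}$ follows from Theorems 8 and 12. On applying Theorems 13 and 14 to (45), we get

$$G(\mathcal{E}_1) = G(\mathcal{E}) + [G(\mathcal{E}_1) - G(\mathcal{E}_2)] + [G(\mathcal{E}_2) - G(\mathcal{E}_3)] + [G(\mathcal{E}_3) - G(\mathcal{E}_4)] + \ldots,$$

i.e.

$$G(\mathcal{E}_1) = G(\mathcal{E}) + G(\mathcal{E}_1) - \lim_{n \to \infty} G(\mathcal{E}_n),$$

whence (43) follows.

Note. The measurability of the limit set $\mathcal{E}$ follows from (45) without the assumption that the $\mathcal{E}_n$ are of finite measure.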
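The finiteness assumption in Theorem 16 is needed for formula (43), not merely for convenience. A standard illustration, given here as our own example on the line with $G$ taken to be ordinary length, is the following.

```latex
% Non-increasing sets of infinite measure for which (43) fails.
\[
  \mathcal{E}_n = [n, +\infty), \qquad
  \mathcal{E}_1 \supset \mathcal{E}_2 \supset \mathcal{E}_3 \supset \ldots, \qquad
  \mathcal{E} = \prod_{n=1}^{\infty} \mathcal{E}_n = \varnothing ,
\]
\[
  G(\mathcal{E}_n) = +\infty \ \text{for every } n, \qquad
  G(\mathcal{E}) = 0 \neq \lim_{n \to \infty} G(\mathcal{E}_n) = +\infty .
\]
```

The limit set is still measurable, in agreement with the Note, but subtracting $G(\mathcal{E}_1) = +\infty$ in the proof is no longer permissible.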


36. Measurable sets (continued). The above theorems on measurable sets have a number of useful corollaries. An elementary figure $R$ is the sum of a finite number of intervals, i.e. is a measurable set, and its measure (which is the same as its exterior measure) is given by (25), where $\Delta_k$ $(k = 1, 2, \ldots, m)$ is some division of $R$ into non-overlapping intervals. Let $L_G$ denote the family of measurable sets, where the subscript $G$ indicates the function $G(\Delta)$ used as a basis for forming the family. We have extended $G(\Delta)$ to all the sets $\mathcal{E}$ belonging to $L_G$, the function $G(\mathcal{E})$ obtained being non-negative and, by Theorem 13, additive for a denumerable as well as a finite number of disjoint sets $\mathcal{E}$. Let $\mathcal{E}_n$ be a vanishing sequence of sets, belonging to $L_G$ and having finite measure, i.e. $\mathcal{E}_1 \supset \mathcal{E}_2 \supset \mathcal{E}_3 \supset \ldots$, and the limit set $\mathcal{E}$ of the $\mathcal{E}_n$ is the empty set. It follows at once from Theorem 16 that $G(\mathcal{E}_n) \to 0$, i.e. the function $G(\mathcal{E})$ is not only non-negative and additive, but is normal for the family of sets $L_G$. In order to underline its additiveness for a denumerable as well as finite number of sets appearing in $L_G$, we shall call this function completely additive. The family $L_G$ also contains unbounded sets. Certain of these may be of finite measure, whilst the measure of the others may be $(+\infty)$. But evidently, not every unbounded set needs to be measurable. Often, when forming a family of measurable sets, only bounded sets are considered, or even sets belonging to a definite finite interval. We shall not subject ourselves to this limitation in our future treatment. A further point is that the initial function $G(\Delta)$ is assumed to be defined for all finite intervals. If $G(\Delta)$ is defined only for intervals $\Delta$ belonging to some interval $\Delta_0$, it can naturally be extended to all intervals $\Delta$ by using the formula $G(\Delta) = G(\Delta \cdot \Delta_0)$, remembering that a product of intervals is also an interval.

The family of sets $L_G$ depends on the choice of initial function $G(\Delta)$. But whatever the choice of this function, it always contains all intervals, elementary figures, open sets and closed sets. We shall give later a fuller characterization of the sets which belong to $L_G$ for any choice of $G(\Delta)$. We shall interpret the set function as a mass. Specifying the original function $G(\Delta)$ amounts to specifying the mass on any interval $\Delta$, the usual conditions for non-negativeness, additiveness and normality being obviously fulfilled. A point set $\mathcal{E}$ is measurable if it is meaningful to speak of the mass located on $\mathcal{E}$, and $G(\mathcal{E})$ is this mass.

We can give a simple example of when the family $L_G$ contains all point sets of the plane. Let mass 1 be concentrated at the point $P$. Here, $G(\Delta) = 1$ if the interval $\Delta$ contains $P$, and $G(\Delta) = 0$ if $\Delta$ does not contain $P$. It is easily shown that the family $L_G$ for such a function $G(\Delta)$ contains all sets, where $G(\mathcal{E}) = 1$ if $\mathcal{E}$ contains $P$, and $G(\mathcal{E}) = 0$ if $\mathcal{E}$ does not contain $P$.
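The additivity of this point-mass function is easy to check on a concrete division of an interval. In the sketch below the function name `point_mass`, the choice $P = (0, 0)$ and the convention that a semi-open rectangle means $a < x \le b$, $c < y \le d$ are assumptions made for the illustration.

```python
def point_mass(rect, p=(0.0, 0.0)):
    """G(rect) for unit mass at p; rect = (a, b, c, d) means a < x <= b, c < y <= d."""
    a, b, c, d = rect
    x, y = p
    return 1 if (a < x <= b and c < y <= d) else 0


# Divide the rectangle (-1, 1] x (-1, 1] into four non-overlapping quarters.
whole = (-1.0, 1.0, -1.0, 1.0)
quarters = [(-1.0, 0.0, -1.0, 0.0), (0.0, 1.0, -1.0, 0.0),
            (-1.0, 0.0, 0.0, 1.0), (0.0, 1.0, 0.0, 1.0)]

# Additivity: exactly one quarter contains the point (0, 0).
print(point_mass(whole), sum(point_mass(q) for q in quarters))
```

Because the rectangles are semi-open, the point $P$ lying on the common boundary of the quarters is counted once only.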

Let us take the important particular case when $G(\Delta)$ is equal to the area of the interval $\Delta$. The family $L_G$ will simply be written as $L$ for this case. Here we have an extension of the concept of area for the wide family of sets $L$. It was this particular case that was considered first by the French mathematician Lebesgue. The function $G(\mathcal{E})$ will be written as $m(\mathcal{E})$ for this case. The family of sets $L$ is usually known as the family of sets which are Lebesgue measurable. It is meaningful to speak of an area for such sets. If $\mathcal{E}$ is a finite or denumerable set of points, $m(\mathcal{E}) = 0$. Similarly, if $\mathcal{E}$ is a segment or the whole of a straight line, $m(\mathcal{E}) = 0$. If we take the same interval as semi-open, open or closed, $m(\Delta)$ has in every case the same value. If a measurable set $\mathcal{E}$ has interior points, obviously $m(\mathcal{E}) > 0$. It can be shown that there exist bounded open sets such that $m(l) > 0$, where $l$ is the boundary of the set ($l$ is closed and therefore measurable). For an open set, $m(O)$ is the sum of the areas of the intervals which appear in (21), this sum being independent of the method of representing $O$ as a sum of intervals. If $F$ is a bounded closed set, on covering it with an open interval $\Delta_0$, we can define $m(F)$ as the difference between the measures of two open sets, viz. $m(F) = m(\Delta_0) - m(\Delta_0 - F)$.

The whole of the construction of family $L_G$ can be performed precisely as above in any finite-dimensional space. In particular, the family $L$ in three-dimensional space is the family of sets having a definite "volume", whereas in one dimension it is the family of sets having a definite "length". Instead of this, for spaces with a finite number of dimensions, we often speak simply of the measure of the set, if it belongs to $L$.


37. Criteria for measurability. Various definitions, equivalent to the 
one above, can be given of measurable sets. We shall indicate some of 
these definitions, confining ourselves for the moment to bounded sets. 

THEOREM 1. The necessary and sufficient condition for a bounded set $\mathcal{E}$ to belong to the family $L_G$ is that, given any positive $\varepsilon$, there exists an elementary figure $R$ such that

$$\mathcal{E} + e_1 = R + e_2, \tag{46}$$

where we have the inequalities for sets $e_1$ and $e_2$:

$$|e_1|_G \le \varepsilon \quad\text{and}\quad |e_2|_G \le \varepsilon. \tag{47}$$


Necessity. Let $\mathcal{E}$ belong to $L_G$. There now exists an open set $O$ such that $\mathcal{E} \subset O$ and $|O - \mathcal{E}|_G \le \varepsilon$. On writing $O - \mathcal{E} = e_1$, we have $O = \mathcal{E} + e_1$, and inequality (47) holds for $e_1$. On the other hand, by Theorem 6 of [32], $O$ is the limit of an increasing sequence of elementary figures $R_n$, where $R_n$ is the sum of the first $n$ terms on the right-hand side of (21). By Theorem 15, we have

$$G(O) = \lim_{n \to \infty} G(R_n),$$

so that we can take so large an $n = m$ that, on setting $R = R_m$, we have $O = R + e_2$, where $|e_2|_G \le \varepsilon$. On comparing both the expressions obtained for $O$, we arrive at (46), where inequalities (47) are fulfilled for $e_1$ and $e_2$.

Sufficiency. Given any $\varepsilon$, let (46) and inequalities (47) hold. Since $R$ is measurable, there exists an open set $O_1$ such that $R \subset O_1$ and $|O_1 - R|_G \le \varepsilon$. On the other hand, by Theorem 4 [34], there exists an open set $O_2$ such that $e_2 \subset O_2$ and $|O_2|_G \le |e_2|_G + \varepsilon$, or, by (47), we have $|O_2|_G \le 2\varepsilon$. The open set $O = O_1 + O_2$ covers $\mathcal{E} + e_1$, and we have

$$O - \mathcal{E} \subset [O - (\mathcal{E} + e_1)] + e_1$$

or, by (6) of [30]:

$$O - \mathcal{E} \subset [(O_1 + O_2) - (R + e_2)] + e_1 \subset (O_1 - R) + (O_2 - e_2) + e_1.$$

On observing that $|O_1 - R|_G \le \varepsilon$, $|O_2 - e_2|_G \le |O_2|_G \le 2\varepsilon$ and (47), we obtain from this: $|O - \mathcal{E}|_G \le 4\varepsilon$, so that $\mathcal{E}$ is measurable, in view of the arbitrariness of $\varepsilon$.

THEOREM 2. The necessary and sufficient condition for a bounded set $\mathcal{E}$ to belong to $L_G$ is that, given any positive $\varepsilon$, there exists an elementary figure $R$ such that

$$|\mathcal{E} - R|_G \le \varepsilon \quad\text{and}\quad |R - \mathcal{E}|_G \le \varepsilon. \tag{48}$$

Let us prove the necessity. Let $\mathcal{E}$ belong to $L_G$. There exists an elementary figure $R$ such that we have (46) and (47). Inequalities (48) follow from the obvious relationships $\mathcal{E} - R \subset e_2$ and $R - \mathcal{E} \subset e_1$. Let us prove the sufficiency. Given any $\varepsilon$, let there exist an $R$ for which inequalities (48) are satisfied. If we put $\mathcal{E} - R = e_2$ and $R - \mathcal{E} = e_1$, we get $\mathcal{E} + e_1 = R + e_2$, where $e_1$ and $e_2$ satisfy inequalities (47), and the measurability of $\mathcal{E}$ follows at once from Theorem 1.

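THEOREM 3. The necessary and sufficient condition for a set $\mathcal{E}$ (which may be unbounded) to belong to $L_G$ is that, given any positive $\varepsilon$, there exist an open set $O$ and a closed set $F$ such that

$$F \subset \mathcal{E} \subset O \quad\text{and}\quad |O - F|_G \le \varepsilon. \tag{49}$$

If $\mathcal{E}$ is measurable, by the definition and Theorem 11 [35], there exist an $F$ and $O$ such that $F \subset \mathcal{E} \subset O$, $|\mathcal{E} - F|_G \le \tfrac{1}{2}\varepsilon$ and $|O - \mathcal{E}|_G \le \tfrac{1}{2}\varepsilon$. We have further: $O - F = (O - \mathcal{E}) + (\mathcal{E} - F)$, whence (49) follows. Conversely, let (49) hold. Now, all the more $|O - \mathcal{E}|_G \le \varepsilon$ and $\mathcal{E}$ is therefore measurable by the definition.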

A further criterion for measurability will be given without proof. The necessary and sufficient condition for $\mathcal{E}$ to be measurable is that, for any choice of set $A$, we have

$$|A|_G = |A \cdot \mathcal{E}|_G + |A - \mathcal{E}|_G. \tag{50}$$
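Condition (50) can be tested exhaustively when the underlying space is finite. The sketch below uses a toy exterior measure on a three-point set; this function is an assumption chosen only to show how (50) separates sets, and does not arise from an interval function $G(\Delta)$ as in the text. The names `outer`, `subsets` and `satisfies_50` are hypothetical.

```python
from itertools import combinations

X = frozenset("abc")


def outer(A):
    """Toy exterior measure: 0 on the empty set, 2 on X, 1 on every other subset."""
    A = frozenset(A)
    if not A:
        return 0
    return 2 if A == X else 1


def subsets(S):
    """All subsets of S, ordered by size."""
    items = sorted(S)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def satisfies_50(E):
    """True if |A| = |A.E| + |A - E| holds for every subset A of X."""
    return all(outer(A) == outer(A & E) + outer(A - E) for A in subsets(X))


# Only the empty set and X itself pass the test for this toy function.
print([set(E) for E in subsets(X) if satisfies_50(E)])
```

For instance, with $\mathcal{E} = \{a\}$ and $A = \{a, b\}$ the left-hand side of (50) is 1 while the right-hand side is 2.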


38. Field of sets. We bring in a new concept regarding families of point sets, a family of sets being understood to mean a system of sets (a set of sets). We define a field of sets as a family of sets with the following properties:

(1) if the sets $\mathcal{E}_1$ and $\mathcal{E}_2$ appear in the family, their difference $\mathcal{E}_1 - \mathcal{E}_2$ appears in the family;

(2) if $\mathcal{E}_1$ and $\mathcal{E}_2$ have no common points, and appear in the family, their sum $\mathcal{E}_1 + \mathcal{E}_2$ belongs to the family.

The following are immediate consequences of this definition. The empty set, which is the difference between two identical sets belonging to the field of sets, must belong to any field of sets. Further, it follows at once from the formulae [30]:

$$\mathcal{E}_1 \mathcal{E}_2 = \mathcal{E}_1 - (\mathcal{E}_1 - \mathcal{E}_2); \quad \mathcal{E}_1 + \mathcal{E}_2 = \mathcal{E}_1 + (\mathcal{E}_2 - \mathcal{E}_1)$$

that the product of two sets belonging to the field also belongs to the field, and the sum of two sets belonging to the field belongs to the field, even when the sets have common points. This statement can evidently be generalized to any finite number of factors or terms, i.e. the sum and product of a finite number of sets belonging to the field belong to the field.

We shall strengthen the requirement of the second part of the definition of a field of sets by requiring that the sum of a denumerable number of disjoint sets belonging to the field belongs to the field. Such a field of sets is described as closed. Thus, a closed field of sets is a family of sets with the following two properties:

(1) if sets $\mathcal{E}_1$ and $\mathcal{E}_2$ appear in the family, their difference $\mathcal{E}_1 - \mathcal{E}_2$ appears in the family;

(2) if the family contains a finite or denumerable number of disjoint sets $\mathcal{E}_n$, it also contains their sum.

Precisely as above, it can be seen that a closed field of sets contains any finite sums and products of sets appearing in it. Let us show that a closed field of sets contains the sums and products of a denumerable number of sets appearing in it. To see this, we write down the following two formulae:

$$\sum_{n=1}^{\infty} \mathcal{E}_n = \mathcal{E}_1 + (\mathcal{E}_2 - \mathcal{E}_1) + [\mathcal{E}_3 - (\mathcal{E}_1 + \mathcal{E}_2)] + [\mathcal{E}_4 - (\mathcal{E}_1 + \mathcal{E}_2 + \mathcal{E}_3)] + \ldots \tag{51}$$

$$\prod_{n=1}^{\infty} \mathcal{E}_n = \mathcal{E}_1 - [(\mathcal{E}_1 - \mathcal{E}_2) + (\mathcal{E}_1 - \mathcal{E}_3) + (\mathcal{E}_1 - \mathcal{E}_4) + \ldots]. \tag{52}$$

The proof of these formulae presents no difficulty. It is sufficient to verify that every element (point) appearing in a set on the left-hand side appears in a set on the right-hand side, and vice versa. Let $\mathcal{E}_n$ appear in the closed field of sets $T$. The terms on the right-hand side of (51) are now disjoint, and appear in $T$. Consequently, by the definition of the closed field $T$, the sum of sets $\mathcal{E}_n$ also appears in $T$. The terms in the square brackets on the right-hand side of (52) appear in $T$, so that their sum also appears in $T$. Thus the whole of the right-hand side, i.e. the product of sets $\mathcal{E}_n$, appears in $T$, which is what we had to prove.
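The element-by-element verification of formulae (51) and (52) can be imitated on finite sets. The following sketch uses four small sets of integers chosen arbitrarily for the illustration; the variable names are ours.

```python
from functools import reduce

sets = [{1, 2, 3}, {2, 3, 4}, {3, 5}, {1, 3, 6}]

# Formula (51): E_1, E_2 - E_1, E_3 - (E_1 + E_2), ... are disjoint
# and have the same sum as the original sets.
pieces, seen = [], set()
for E in sets:
    pieces.append(E - seen)
    seen |= E
union = set().union(*sets)
disjoint = all(pieces[i].isdisjoint(pieces[j])
               for i in range(len(pieces)) for j in range(i + 1, len(pieces)))

# Formula (52): the product equals E_1 - [(E_1 - E_2) + (E_1 - E_3) + ...].
product = reduce(set.intersection, sets)
removed = set().union(*(sets[0] - E for E in sets[1:]))

print(disjoint, set().union(*pieces) == union, sets[0] - removed == product)
```

A finite check of this kind proves nothing about denumerable families, but it shows which operations of a field (difference, then sum of disjoint terms) each formula actually uses.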

It follows at once from the theorems proved in [36] that $L_G$ is a closed field of sets. We consider the interval function $G(\Delta)$, which we have extended to the closed field $L_G$. The family of intervals does not represent a field, since the difference between two intervals may not be an interval. The family of elementary figures $R$ is a field, though not a closed field. Our process for extending $G(\Delta)$ consisted in first extending $G(\Delta)$ to the field of elementary figures, then to the closed field $L_G$. The function $G(\mathcal{E})$ proved here to be non-negative, completely additive and normal in $L_G$ in the sense indicated in [37]. Let us explain the connection between the concepts of normality and additiveness for a set function.

Let $T$ be a field, which may be non-closed. The function $G(\mathcal{E})$, defined for all sets appearing in $T$, is said to be completely additive in $T$ when the following condition is fulfilled: if the set $\mathcal{E}$ belonging to $T$ is the sum of a finite or denumerable number of sets $\mathcal{E}_n$ that also belong to $T$ and are mutually disjoint, then

$$G(\mathcal{E}_1 + \mathcal{E}_2 + \ldots) = G(\mathcal{E}_1) + G(\mathcal{E}_2) + \ldots$$


The concept of completely additive function has already been men- 
tioned. There is a direct connection between the concepts of com- 
pletely additive and normal functions, which is expressed by the 
following theorem: 

THEOREM. The necessary and sufficient condition for a function $G(\mathcal{E})$, defined on a field $T$ and taking only finite values, to be additive and normal is that it be completely additive.

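If the function is additive, it follows at once that, given $A \subset B$, $G(B - A) = G(B) - G(A)$. Let $G(\mathcal{E})$ be additive and normal, and let us show that it is completely additive. Suppose that $\mathcal{E}$ is the sum of a denumerable number of pairwise disjoint sets $\mathcal{E}_n$ $(n = 1, 2, \ldots)$. We can write

$$\mathcal{E} = \mathcal{E}_1 + \ldots + \mathcal{E}_n + [\mathcal{E} - (\mathcal{E}_1 + \mathcal{E}_2 + \ldots + \mathcal{E}_n)]$$

and, since the function is additive,

$$G(\mathcal{E}) = G(\mathcal{E}_1) + \ldots + G(\mathcal{E}_n) + G[\mathcal{E} - (\mathcal{E}_1 + \ldots + \mathcal{E}_n)]. \tag{53}$$

But $\mathcal{E} - (\mathcal{E}_1 + \ldots + \mathcal{E}_n)$ is a vanishing sequence. We pass to the limit in equation (53), taking account of the normality:

$$G(\mathcal{E}) = \lim_{n \to \infty} [G(\mathcal{E}_1) + \ldots + G(\mathcal{E}_n)] = \sum_{n=1}^{\infty} G(\mathcal{E}_n).$$

This shows that $G(\mathcal{E})$ is completely additive. Now suppose, conversely, that $G(\mathcal{E})$ is completely additive; we show that it is normal. Let $\mathcal{E}_1 \supset \mathcal{E}_2 \supset \ldots$ be a vanishing sequence. We have to show that $G(\mathcal{E}_n) \to 0$. We can write

$$\mathcal{E}_n = \mathcal{E}_1 - [(\mathcal{E}_1 - \mathcal{E}_2) + (\mathcal{E}_2 - \mathcal{E}_3) + \ldots + (\mathcal{E}_{n-1} - \mathcal{E}_n)],$$

where the terms in brackets are disjoint; hence we have, since $G(\mathcal{E})$ is additive:

$$G(\mathcal{E}_n) = G(\mathcal{E}_1) - [G(\mathcal{E}_1 - \mathcal{E}_2) + G(\mathcal{E}_2 - \mathcal{E}_3) + \ldots + G(\mathcal{E}_{n-1} - \mathcal{E}_n)]. \tag{54}$$

On the other hand, since the sequence is vanishing, we have the formula

$$\mathcal{E}_1 = \sum_{k=1}^{\infty} (\mathcal{E}_k - \mathcal{E}_{k+1}),$$

from which it follows, $G(\mathcal{E})$ being completely additive, that

$$G(\mathcal{E}_1) = \sum_{k=1}^{\infty} G(\mathcal{E}_k - \mathcal{E}_{k+1}) = \lim_{n \to \infty} \sum_{k=1}^{n-1} G(\mathcal{E}_k - \mathcal{E}_{k+1}),$$

and a comparison with (54) shows that $G(\mathcal{E}_n) \to 0$, which is what we wanted to prove. Above, we extended a non-negative, additive and normal interval function $G(\Delta)$ to a closed field $L_G$, the function $G(\mathcal{E})$ thus obtained being completely additive. It can be shown that no other extension of $G(\Delta)$ to $L_G$ is possible, given complete additivity.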


39. Independence of the choice of axes. Some remarks must be made regarding the independence of the measure of the choice of axes. The original function $G(\Delta)$ was defined on semi-open rectangles, whose sides are parallel to the $X$ and $Y$ axes.

The field $L_G$ contains semi-open rectangles on the plane with any direction of side, since every such rectangle is the difference between a closed rectangle and the closed set of points composing two sides and three vertices of the rectangle. The function $G(\mathcal{E})$ is therefore defined, in particular, on all semi-open rectangles $\Delta'$, whose sides are parallel to some other Cartesian system of axes $X'$ and $Y'$, $G(\Delta')$ being additive and normal on these rectangles. If, after choosing the new axes $X'$ and $Y'$, we start from $G(\Delta')$ and extend it as indicated above, we arrive at a field $L_{G'}$. It may easily be shown that this field is the same as $L_G$, and that, with the new extension that we perform by starting from $G(\Delta')$, the same values of $G(\mathcal{E})$ are obtained on all intervals as were obtained with the previous extension, which was performed by starting from $G(\Delta)$. This assertion is based on the fact that every open set $O$ can be written either as the sum of intervals $\Delta_k$, no pair of which has common points, or as the sum of $\Delta'_k$, no pair of which has common points, where

$$G(O) = \sum_{k=1}^{\infty} G(\Delta_k) = \sum_{k=1}^{\infty} G(\Delta'_k).$$

It follows from this, by Theorem 4 of [34], that the exterior measures of any set in the two coordinate systems are the same. It further follows from the definition of measurability that a set will be simultaneously measurable in both coordinate systems, i.e. the field $L_{G'}$ is the same as the field $L_G$. The coincidence of the measures in both systems of axes follows at once from the fact that, by the above-mentioned theorem, $G(\mathcal{E})$ is the strict lower bound of the measures of the open sets covering $\mathcal{E}$. A further remark: if $G(\Delta)$ is the ordinary area of the rectangle $\Delta$, i.e. the Lebesgue measure $m(\Delta)$, then $G(\Delta')$ is the ordinary area of $\Delta'$ [cf. II, 92].


40. The B field. As we have indicated, a closed field $L_G$ depends on the choice of function $G(\Delta)$. We shall next indicate a closed field such that a set appearing in it belongs to any closed field $L_G$ and, in particular, belongs to $L$. We take all possible closed fields $T$ such that every $T$ contains all closed intervals, and we form the family of sets $B$, consisting of sets belonging to all the above-mentioned closed fields $T$. It may easily be seen that the family of sets $B$ is also a closed field. For, if $\mathcal{E}_1$ and $\mathcal{E}_2$ belong to $B$, they belong to all our closed fields $T$, i.e. the difference $\mathcal{E}_1 - \mathcal{E}_2$ belongs to all the $T$, and hence to the family $B$. The second part of the definition of closed field may be proved similarly. The closed field $B$ is therefore the common part of all the closed fields that contain all possible closed intervals. This closed field $B$ evidently appears in the composition of each $L_G$, since this latter also contains all closed intervals. Every open set has been seen [33] to be expressible as the sum of a denumerable number of closed intervals, so that the closed field $B$ contains all open sets. Any closed set $F$ is the complement of some open set $O$, i.e. can be expressed as the difference between the entire plane (an open set) and the set $O$, i.e. the $B$ field also contains all closed sets. The field of sets $B$ was first considered by the French mathematician Borel (before Lebesgue). The sets belonging to the $B$ field are sometimes called $B$ measurable sets or Borel (measurable) sets.

A different definition to the above can be given of the $B$ field, viz. we reckon that a set $\mathcal{E}$ belongs to the $B$ field if it can be obtained from closed intervals with the aid of the following two operations, applied a finite or denumerable number of times:

(1) formation of the sum of a finite or denumerable number of sets 
already constructed; 

(2) formation of the product of a finite or denumerable number of 
sets already constructed. This definition requires certain explanations, 
but we shall not dwell on these. We shall also omit the proof that the 
new definition of B field is equivalent to the previous. We shall 
conclude the present section by proving two simple theorems. 

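THEOREM 1. If $\mathcal{E}$ is any set of $L_G$, there exist two sets $\mathcal{E}_1$ and $\mathcal{E}_2$ belonging to the $B$ field (and hence to the field $L_G$), such that

$$\mathcal{E}_1 \subset \mathcal{E} \subset \mathcal{E}_2 \quad\text{and}\quad G(\mathcal{E}_1) = G(\mathcal{E}_2) = G(\mathcal{E}). \tag{55}$$

We know that there exist, for a set $\mathcal{E}$ belonging to $L_G$, closed sets $F_n$ and open sets $O_n$ such that

$$F_n \subset \mathcal{E} \subset O_n; \quad G(\mathcal{E} - F_n) \le \frac{1}{n}; \quad G(O_n - \mathcal{E}) \le \frac{1}{n}. \tag{56}$$

The sets $F_n$ and $O_n$ belong to the $B$ field. Consequently, by the definition of closed field, we can assert that the sum of sets $F_n$ and the product of sets $O_n$ belong to the $B$ field:

$$\mathcal{E}_1 = \sum_{n=1}^{\infty} F_n; \quad \mathcal{E}_2 = \prod_{n=1}^{\infty} O_n. \tag{57}$$

Recalling that $F_n \subset \mathcal{E} \subset O_n$, we can assert that $\mathcal{E}_1 \subset \mathcal{E} \subset \mathcal{E}_2$. Moreover, $\mathcal{E} - \mathcal{E}_1 \subset \mathcal{E} - F_n$ and $\mathcal{E}_2 - \mathcal{E} \subset O_n - \mathcal{E}$, so that $G(\mathcal{E} - \mathcal{E}_1) \le 1/n$ and $G(\mathcal{E}_2 - \mathcal{E}) \le 1/n$ for any $n$. The left-hand sides are independent of $n$, so that the last inequalities lead to equation (55), and the theorem is proved. It can be stated as follows: every set $\mathcal{E}$ of $L_G$ can be included between sets of $B$ that have the same measure as the given set $\mathcal{E}$.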

A family of sets, which will be useful later, can be distinguished 
from the B field. 

DEFINITION. A set $\mathcal{E}$ is said to be a $G_\delta$ set if it is an open set or the product of a denumerable number of open sets.

We notice first of all that, as mentioned above [32], the product of a denumerable number of open sets may or may not be an open set. It follows at once from the definition of $G_\delta$ sets that the product of a finite or denumerable number of $G_\delta$ sets is also a $G_\delta$ set. Let us show that every finite closed interval $\Delta$ $(a \le x \le b,\ c \le y \le d)$ is a $G_\delta$ set. In fact, we can write it as the product of open sets $\Delta_n$ $(a - \varepsilon_n < x < b + \varepsilon_n,\ c - \varepsilon_n < y < d + \varepsilon_n)$, where $\varepsilon_n$ is a sequence of positive numbers tending to zero. It may readily be shown that every closed set is a $G_\delta$ set. We shall not require this in future. The following assertion is an immediate consequence of the theorem proved above:

THEOREM 2. Every measurable set $\mathcal{E}$ can be covered by a set $H$ of type $G_\delta$ such that $G(\mathcal{E}) = G(H)$.

Notice also that, if $\mathcal{E}$ belongs to the closed interval $\Delta$, the covering set $H$ can be chosen in such a way that it belongs to $\Delta$. For, if $H$ is a $G_\delta$ set covering $\mathcal{E}$ and satisfying the condition $G(\mathcal{E}) = G(H)$, the set $H' = H\Delta$ will also be a $G_\delta$ set, covering $\mathcal{E}$ and satisfying the condition $G(\mathcal{E}) = G(H')$, where $H'$ obviously belongs to $\Delta$.


41. The case of a single variable. The theory of measure takes a simpler form in the case of one variable. As we know, a non-negative, additive and normal function of a semi-open interval reduces to a non-decreasing point function $g(x)$:

$$G(\Delta) = G((a, b]) = g(b + 0) - g(a + 0).$$

Starting from this function $G(\Delta)$, as indicated above, we form a set function $G(\mathcal{E})$, defined for all sets belonging to $L_G$. The value of $G(\mathcal{E})$ for a set $\mathcal{E}$ belonging to $L_G$ is sometimes called the variation of $g(x)$ on the set $\mathcal{E}$. If $g(x) = x$, we get Lebesgue measurable sets, and $G(\mathcal{E})$ is a generalization of the concept of length for such a set. If $g(x)$ is defined only in some interval, it can be extended to the entire axis, as indicated above.
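As a small numerical illustration of the formula for $G(\Delta)$, the sketch below takes a hypothetical non-decreasing function with a unit jump at $x = 0$. The particular $g$, and the assumption that it is continuous from the right (so that $g(x + 0) = g(x)$), are ours and serve only to make the example computable.

```python
def g(x):
    """Hypothetical non-decreasing, right-continuous g: slope 1 plus a unit jump at x = 0."""
    return x + (1.0 if x >= 0 else 0.0)


def G(a, b):
    """G((a, b]) = g(b + 0) - g(a + 0); g is right-continuous, so g(x + 0) = g(x)."""
    return g(b) - g(a)


# Additivity on (-1, 1] = (-1, 0] + (0, 1]; the jump at 0 is charged
# to (-1, 0], the semi-open interval that contains the point 0.
print(G(-1, 1), G(-1, 0), G(0, 1))
```

The interval $(-1, 0]$ receives length 1 plus the jump 1, whereas $(0, 1]$ receives length 1 only.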

We introduce instead of $x$ the new variable $t$ in accordance with

$$t = g(x), \tag{58}$$

the meaning of this change of variable being as follows. If $g(x)$ is continuous at a point $x$, the corresponding value $t$ is defined by (58). Whereas if $x$ is a point of discontinuity, the closed interval $[g(x - 0),\ g(x + 0)]$ of the variable $t$ is taken as corresponding to it. The semi-open interval $(a, b]$ of the variable $x$ becomes, with this correspondence, the semi-open interval $(g(a + 0),\ g(b + 0)]$ of the variable $t$; the latter interval degenerates to a point when $g(b + 0) = g(a + 0)$. If $e_x$ are sets of the $x$ axis and $e_t$ are the corresponding sets of the $t$ axis, it can be shown that the exterior measure of $e_x$ with respect to $g(x)$ is equal to the exterior measure of $e_t$ in the Lebesgue sense, i.e. evaluated on condition that the length of a semi-open interval is taken as basis. In the case of a single variable, an elementary figure is the sum of a finite number of semi-open intervals having no common points, and it can readily be shown that, if $e_x$ is measurable with respect to $g(x)$, $e_t$ is Lebesgue measurable, the measure of $e_x$ with respect to $g(x)$ being equal to the Lebesgue measure of the set $e_t$.


§ 2. Measurable functions 


42. Definition of measurable function. The problem of this and the following sections is the construction of a certain class of functions and the investigation of the properties of the functions. A more general definition of integral will be given later on the basis of this class of functions. We shall assume in our treatment that the function $G(\Delta)$, on which the theory of measure is based, is fixed in some manner, i.e. we shall be concerned with a definite field $L_G$. This may be e.g. the field of Lebesgue measurable sets $L$. Let a point function $f(P)$, that takes real values, be given on the measurable set $\mathcal{E}$. These values may be finite or infinite, i.e. $f(P)$ can take the values $(+\infty)$ and $(-\infty)$ as well as finite values. We introduce the following notation. We write $\mathcal{E}[f > a]$ for the set of points of $\mathcal{E}$ at which $f(P) > a$. Similarly, $\mathcal{E}[f \le a]$ means the set of points of $\mathcal{E}$ at which $f(P) \le a$. If $f(P)$ and $g(P)$ are two functions, the symbol $\mathcal{E}[f = g]$ denotes the set of points of $\mathcal{E}$ at which $f(P) = g(P)$, and so on.

DEFINITION. A function $f(P)$, given on a measurable set $\mathcal{E}$, is said to be measurable if, given any real $a$, the sets

$$\mathcal{E}[f > a]; \quad \mathcal{E}[f \le a]; \quad \mathcal{E}[f \ge a]; \quad \mathcal{E}[f < a] \tag{1}$$

are measurable. We first prove the following theorem:




THEOREM 1. A sufficient condition for the measurability of sets (1) for any $a$ is that one of the sets be measurable for any $a$.

The sets $\mathcal{E}[f > a]$ and $\mathcal{E}[f \le a]$ are complementary, and the measurability of one of them for any $a$ is equivalent to the measurability of the other. Similarly, the measurability of the third of sets (1) is equivalent to the measurability of the fourth. Let us show, say, that the measurability of the third set for any $a$ implies the measurability of the remaining sets. In fact, the measurability of the third set implies the measurability of the fourth, and of the set $\mathcal{E}[f > a]$, which can be written as

$$\mathcal{E}[f > a] = \sum_{n=1}^{\infty} \mathcal{E}\left[f \ge a + \frac{1}{n}\right],$$

so that the second set is also measurable.
We notice also that the sets [f = +o] and @[f = —œ]} can be 
written as 


Sa +o]= JE 4>]; Elf=- = E< -a 


Actually, it is sufficient to prove the measurability of (1) only for 
rational values of a. For, every irrational a can be written as the 
limit of a decreasing sequence of rational a,, and the measurability of 
Z[/ > a] follows directly from the formula 


Flf>al= SFlf>a,|. 


We shall give a number of simple properties of measurable functions, 
following directly from the above definition. 

THEOREM 2. If $f(P)$ is measurable on $\mathcal{E}$, it is measurable on any
measurable part $\mathcal{E}'$ of the set $\mathcal{E}$. If $f(P)$ is measurable on a finite or
denumerable number of pairwise disjoint sets $\mathcal{E}_n$, it is measurable on the
set $\mathcal{E}$ representing the sum of the $\mathcal{E}_n$.

These statements follow directly from the formulae:

$$\mathcal{E}'[f > a] = \mathcal{E}[f > a] \cdot \mathcal{E}'; \qquad \mathcal{E}[f > a] = \sum_{n} \mathcal{E}_n[f > a].$$

THEOREM 3. If $\mathcal{E}$ is a set of measure zero, any function $f(P)$ is
measurable on this set.

For, given any $a$, the set $\mathcal{E}[f > a]$ is part of the set $\mathcal{E}$, which has
measure zero, i.e. the set $\mathcal{E}[f > a]$ has measure zero, i.e. $f(P)$ is
measurable.


DEFINITION. Two functions $f(P)$ and $g(P)$, defined on a set $\mathcal{E}$, are
said to be equivalent on this set, or simply equivalent, if the set
$\mathcal{E}[f \ne g]$ has measure zero. We prove the following theorem on equivalent
functions.

THEOREM 4. If $f(P)$ and $g(P)$ are equivalent functions on a measurable
set $\mathcal{E}$, and one of them is measurable, then the other is measurable.

By hypothesis, the set $\mathcal{E}[f \ne g] = A$ is a set of measure zero. On the
measurable set $\mathcal{E}' = \mathcal{E} - A$, we have $f(P) = g(P)$. The measurability
of $f(P)$ on $\mathcal{E}$ implies the measurability of $f(P)$ on $\mathcal{E}'$, so that $g(P)$ is
also measurable on $\mathcal{E}'$. By Theorem 3, the function $g(P)$ is measurable
on the set $A$. Hence, by Theorem 2, $g(P)$ is measurable on the set
$\mathcal{E} = \mathcal{E}' + A$, and the theorem is proved.

It is easily shown that, if $f_1$ is equivalent to $g_1$, and $f_2$ equivalent to
$g_2$, then $f_1 + f_2$ is equivalent to $g_1 + g_2$, $f_1 f_2$ is equivalent to
$g_1 g_2$, and $f_1/f_2$ is equivalent to $g_1/g_2$, provided the relevant
operations have a meaning almost everywhere.

If two continuous functions are equivalent in the sense of the Lebesgue
measure on some interval or throughout the plane, it is easily seen
that their values are the same at every point. For if we were to have,
say, $f(P_0) - g(P_0) > 0$ at some point, this inequality would be retained,
by virtue of the continuity of the functions, in some sufficiently small
neighbourhood $\delta$ of $P_0$, where $m(\delta) > 0$, and this contradicts the
definition of equivalent functions.

Let us quote some simple examples of measurable functions. Let
$f(P)$ be continuous in a finite closed interval $\Delta_0$. Given any $a$, we take
the set $\Delta_0[f(P) \ge a]$ and show that it is closed. It will then follow
immediately that it is measurable, so that $f(P)$ is a measurable function.
If $P_n$ $(n = 1, 2, \ldots)$ is a sequence of points having a limit point $P$,
and $f(P_n) \ge a$, then $f(P) \ge a$ by virtue of the continuity of the
function, and this shows that the set $\Delta_0[f(P) \ge a]$ is closed. Similarly,
if $f(P)$ is continuous throughout the plane, it is measurable. For, if $\Delta_0$
is any closed interval, the set $\Delta_0[f(P) \ge a]$ is measurable, as we have
just shown. The limiting set will also be measurable on indefinite extension
of $\Delta_0$. This limiting set is the set of all the points of the plane where
$f(P) \ge a$.

Now let $f(P)$ have a point of discontinuity $P_0$. We cover it by a
sequence of open sets $\Delta_n$ $(n = 1, 2, \ldots)$, which shrink indefinitely
to $P_0$. Outside $\Delta_n$ the function $f(P)$ is continuous and the set $e_n$ of the
points $P$ where $f(P) \ge a$ is closed. As $n$ increases, the sets $e_n$ do not
decrease and tend to the measurable set $e$. We must also add $P_0$ to
this set, if $f(P_0) \ge a$, and thus obtain the set of all points at which
$f(P) \ge a$, this set being measurable in view of the above. The same
arguments apply in the case of a finite number of points of discontinuity,
i.e. a function with a finite number of discontinuities is
measurable.

The following statement will be given without proof: if $f(P)$ takes
finite values on the closed interval $\Delta_0$ and the set of its points of
discontinuity has measure zero, $f(P)$ is measurable on $\Delta_0$. But this
condition for measurability is merely sufficient. An example is easily
given, in which every point of a set $\mathcal{E}$ is a point of discontinuity, yet
the function is measurable. We take the function $f(x)$, defined on the
interval $[0, 1]$ as follows: $f(x) = 0$ if $x$ is a rational number, and
$f(x) = 1$ if $x$ is irrational. We take the Lebesgue measure, i.e. the case
when $G(\Delta)$ is the length of an interval. The measure of any point is now
equal to zero. The rational points of the interval $[0, 1]$ form a denumerable
set, and in view of the fact that the measure is completely additive,
the set of rational points also has measure zero. The given $f(x)$ differs
from a function identically unity throughout the interval only on the
set of rational points having measure zero, i.e. $f(x)$ is equivalent to a
function identically equal to unity, and $f(x)$ is measurable by Theorem
4. But it is easily seen that every point $x_0$ of the interval $[0, 1]$ is a
point of discontinuity of $f(x)$. For, there are both rational and irrational
values of $x$ in any $\varepsilon$-neighbourhood of $x_0$, i.e. $f(x)$ takes both the value 0
and the value 1 in any $\varepsilon$-neighbourhood of $x = x_0$, so that $x_0$ is a point
of discontinuity. We shall indicate in [45] the deep bond between the
concepts of measurability and continuity.

We shall also consider the so-called piecewise constant function on
a measurable set $\mathcal{E}$, i.e. the function $f(P)$ which takes a finite or
denumerable number of values $c_k$ $(k = 1, 2, \ldots)$ on $\mathcal{E}$. If the sets $\mathcal{E}_k$
on which $f(P) = c_k$ are measurable, it follows at once from the
definition of measurability that the piecewise constant function $f(P)$
is measurable on $\mathcal{E}$. Let us give a further example. Let $f(P)$ be measurable
on the measurable set $\mathcal{E}$. Suppose that it is zero on the complement
$C\mathcal{E}$. The function thus formed is measurable on $\mathcal{E}$ and on $C\mathcal{E}$, i.e. by
Theorem 2, is measurable throughout the plane.

To mention a further case of a single variable, let $g(x)$ be a
non-decreasing function on which the measure is based [42], and $f(x)$ a
measurable function. It is sometimes said in this case that $f(x)$ is
measurable with respect to $g(x)$, whilst if $g(x) = x$, we simply say
that $f(x)$ is measurable.


43. Properties of measurable functions. Certain further properties
of measurable functions are worth mentioning.

THEOREM 1. If $f(P)$ is a measurable function, $|f(P)|$ is also a
measurable function.

This follows at once from the formula:

$$\mathcal{E}[|f| > a] = \mathcal{E}[f > a] + \mathcal{E}[f < -a].$$

THEOREM 2. If $f(P)$ is a measurable function and $c$ is a finite constant,
different from zero, $c + f(P)$ and $cf(P)$ are measurable functions.

The first assertion follows at once from the formula:

$$\mathcal{E}[c + f(P) > a] = \mathcal{E}[f(P) > a - c],$$

and the second from the formulae:

$$\mathcal{E}[cf(P) > a] = \mathcal{E}\left[f(P) > \frac{a}{c}\right] \quad \text{for } c > 0,$$

$$\mathcal{E}[cf(P) > a] = \mathcal{E}\left[f(P) < \frac{a}{c}\right] \quad \text{for } c < 0.$$


THEOREM 3. If $f(P)$ and $g(P)$ are measurable functions, the set
$\mathcal{E}[f > g]$ is measurable.

We enumerate all the rational numbers: $r_1, r_2, \ldots$ The measurability
of the set of the theorem follows at once from the formula

$$\mathcal{E}[f > g] = \sum_{k=1}^{\infty} \mathcal{E}[f > r_k] \cdot \mathcal{E}[g < r_k].$$

THEOREM 4. If $f(P)$ and $g(P)$ are measurable functions taking finite
values, the functions $f - g$, $f + g$, $fg$ and $f/g$ (when $g \ne 0$) are
measurable.

The measurability of $f - g$ follows at once from

$$\mathcal{E}[f - g > a] = \mathcal{E}[f > a + g]$$

and Theorems 2 and 3. The measurability of the sum follows from
$f + g = f - (-g)$ and Theorem 2 with $c = -1$. The measurability of
the square $f^2$ of the measurable function $f$ follows at once from

$$\mathcal{E}[f^2 > a] = \mathcal{E}[f > \sqrt{a}] + \mathcal{E}[f < -\sqrt{a}] \quad (a \ge 0)$$

(for $a < 0$ the set on the left is the whole of $\mathcal{E}$), whilst the
measurability of the product $fg$ follows from

$$fg = \frac{1}{4}\left[(f + g)^2 - (f - g)^2\right].$$


We use the following formulae to prove the measurability of $1/g$, on
condition that $g$ does not take the value 0:

$$\mathcal{E}\left[\frac{1}{g} > a\right] = \mathcal{E}[g > 0] \cdot \mathcal{E}\left[g < \frac{1}{a}\right] \quad \text{for } a > 0;$$

$$\mathcal{E}\left[\frac{1}{g} > a\right] = \mathcal{E}[g > 0] + \mathcal{E}\left[g < \frac{1}{a}\right] \quad \text{for } a < 0;$$

$$\mathcal{E}\left[\frac{1}{g} > a\right] = \mathcal{E}[g > 0] \quad \text{for } a = 0.$$

Finally, the formula $f/g = f \cdot (1/g)$ implies the measurability of the
quotient. It is necessary to make the proviso in this theorem that $f(P)$
and $g(P)$ take finite values at every point of $\mathcal{E}$. Otherwise, the
operations on these functions may become meaningless.

If, say, $f = +\infty$ at some point and $g = -\infty$, we could not speak
about the sum $f + g$ at this point. If there is no such indeterminacy
in performing the operations on $f$ and $g$, infinite values are
permissible for $f(P)$ and $g(P)$. For instance, the following theorem may be
proved.

THEOREM 5. If $f(P)$ and $g(P)$ are measurable functions taking finite
values and the value $(+\infty)$, the function $f + g$ is measurable.

Let $A$ be the set on which at least one of the functions is equal to
$(+\infty)$. This set is measurable by virtue of the measurability of $f$ and
$g$, and the sum $f + g$ has the constant value $(+\infty)$ on the set $A$, i.e.
is measurable. Both the functions $f$ and $g$ have finite values on the set
$\mathcal{E}' = \mathcal{E} - A$, and by Theorem 4, the sum $f + g$ is measurable on $\mathcal{E}'$.
It is therefore measurable on $\mathcal{E} = \mathcal{E}' + A$, which is what we had
to prove.


44. The limit of a measurable function. We investigate in this
section passage to the limit for measurable functions. Our fundamental
result will be that a passage to the limit for measurable functions
leads to another measurable function. Some preliminary facts in
connection with limits must be given. Let

$$a_1, a_2, a_3, \ldots \tag{2}$$

be a sequence of real numbers, which may possibly include $(+\infty)$
or $(-\infty)$. Let $s_n$ denote the strict lower bound of the set of numbers
$[a_n, a_{n+1}, \ldots]$ and $t_n$ the strict upper bound of this set, i.e.

$$s_n = \inf\,[a_n, a_{n+1}, \ldots]; \qquad t_n = \sup\,[a_n, a_{n+1}, \ldots]. \tag{3}$$

As $n$ increases, this number set is impoverished, i.e. $s_n$ does not
decrease and $t_n$ does not increase. Hence, as $n$ increases indefinitely,
the monotonic sequences $s_n$ and $t_n$ have finite or infinite limits:

$$\lim_{n \to \infty} s_n = S; \qquad \lim_{n \to \infty} t_n = T, \tag{4}$$

where, in view of the monotonicity,

$$S = \sup s_n; \qquad T = \inf t_n, \tag{5}$$

and, in addition, $s_n \le t_n$ implies $S \le T$. As regards the sequence
$(+\infty), (+\infty), \ldots$, we assume that its limit is $(+\infty)$, and similarly for
$(-\infty)$. The number $S$ is called the lower limit of sequence (2), whilst $T$ is
the upper limit of the sequence.
The following symbols are often used:

$$S = \underline{\lim}_{n \to \infty}\, a_n \quad \text{or} \quad S = \liminf_{n \to \infty} a_n;$$

$$T = \overline{\lim}_{n \to \infty}\, a_n \quad \text{or} \quad T = \limsup_{n \to \infty} a_n.$$
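
The quantities $s_n$, $t_n$ of (3) can be tabulated numerically; the following is only a sketch under stated assumptions, not part of the text. The sequence $a_k = (-1)^k (1 + 1/k)$ and the helper name `tail_bounds` are illustrative choices, and the infinite tails are replaced by tails of a finite list. For this particular sequence the finite tails happen to give the true $s_n$ and $t_n$, apart from the very last entry.

```python
from fractions import Fraction

def tail_bounds(a):
    """For a finite list a, return (s, t) with s[n] = min(a[n:]) and t[n] = max(a[n:])."""
    s, t = [None] * len(a), [None] * len(a)
    lo = hi = a[-1]
    for n in range(len(a) - 1, -1, -1):  # one backward pass over the tails
        lo, hi = min(lo, a[n]), max(hi, a[n])
        s[n], t[n] = lo, hi
    return s, t

# Illustrative sequence a_k = (-1)^k (1 + 1/k), k = 1, ..., 1000 (stored at index k - 1).
a = [(-1) ** k * (1 + Fraction(1, k)) for k in range(1, 1001)]
s, t = tail_bounds(a)

# s is non-decreasing and t is non-increasing; the values below are the
# bounds of the tail starting at k = 501.
S_est, T_est = s[500], t[500]
print(float(S_est), float(T_est))
```

Here the tail bounds approach $S = -1$ and $T = +1$, so $S < T$ and the sequence has no limit.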

Let us prove the following lemma.

LEMMA. The necessary and sufficient condition for the existence of a
limit (finite or infinite) of sequence (2) is that $S = T$, and if this
condition is fulfilled, the limit is equal to $S$.

We first prove the sufficiency. We have $s_n \le a_k \le t_n$ for $k \ge n$, and
if the limits of $s_n$ and $t_n$ are the same, i.e. $S = T$, obviously $a_n \to S$.
We now prove the necessity. Let the sequence (2) have a finite limit $\sigma$.
All the numbers $a_n$ now lie in the interval $(\sigma - \varepsilon, \sigma + \varepsilon)$ for
sufficiently large $n$, $\varepsilon$ being any given small positive number. Hence the
closed interval $[\sigma - \varepsilon, \sigma + \varepsilon]$ contains all the $s_n$ and $t_n$ for
sufficiently large $n$. It follows,
since $\varepsilon$ is arbitrary, that $s_n \to \sigma$ and $t_n \to \sigma$, i.e. $S = T = \sigma$. The case
of an infinite limit of sequence (2) is similarly considered. We now prove
some properties of sequences of measurable functions.

THEOREM 1. If $f_n(P)$ is a sequence of measurable functions, the strict
lower and strict upper bounds of the values of $f_n(P)$ at any point $P$ of set
$\mathcal{E}$ are also measurable functions, i.e. the functions

$$\varphi(P) = \inf_n f_n(P) \quad \text{and} \quad \psi(P) = \sup_n f_n(P) \tag{6}$$

are measurable.

Let us prove that, say, $\varphi(P)$ is measurable. If we have $\varphi(P) < a$ at
the point $P$, at least one of the values of $f_n(P)$ is also $< a$, and
conversely, if at least one of $f_n(P) < a$, then $\varphi(P) < a$. We therefore have

$$\mathcal{E}[\varphi(P) < a] = \sum_{n=1}^{\infty} \mathcal{E}[f_n(P) < a],$$

whence it follows, since the $f_n(P)$ are measurable, that $\varphi(P)$ is
measurable.

THEOREM 2. If $f_n(P)$ is a sequence of measurable functions, monotonically
increasing (or monotonically decreasing) at every point $P$ of the set
$\mathcal{E}$, the limit function $f(P)$ is also measurable.

This follows at once from the previous theorem, since the limit
function of a monotonically increasing sequence of functions is the
same as its strict upper bound $\psi(P)$, and the same as the lower bound
$\varphi(P)$ for a monotonically decreasing sequence.

THEOREM 3. If $f_n(P)$ is a sequence of measurable functions, the lower
limit $S(P)$ and upper limit $T(P)$ of the sequence are also measurable
functions.

We introduce the functions

$$s_n(P) = \inf\,[f_n(P), f_{n+1}(P), \ldots]; \qquad t_n(P) = \sup\,[f_n(P), f_{n+1}(P), \ldots].$$

They are measurable for any $n$ by Theorem 1. The functions $S(P)$
and $T(P)$ are the limits of monotonic sequences $s_n(P)$ and $t_n(P)$, so
that, by Theorem 2, they are also measurable functions.

THEOREM 4. If $f_n(P)$ is a sequence of measurable functions, convergent
at every point $P$ of a set $\mathcal{E}$, the limit function $f(P)$ is also measurable.

The measurability of $f(P)$ follows from Theorem 3, since the limit
$f(P)$ must be the same as $S(P)$ and $T(P)$ when it exists at every point.
This last theorem is fundamental for what follows, and we shall
generalize it to some extent.

We say that a property holds almost everywhere on $\mathcal{E}$ if it holds at
all points of $\mathcal{E}$ except for a set of points of measure zero.

THEOREM 5. If $f_n(P)$ is a sequence of functions measurable on $\mathcal{E}$, and
convergent almost everywhere on $\mathcal{E}$, the limit function $f(P)$ is measurable.

Notice that the limit function $f(P)$ may not be defined on some
part $A$ of the set $\mathcal{E}$, where $A$ is of measure zero. We define $f(P)$ on $A$
in any manner. The sequence $f_n(P)$ is convergent at every point of the
measurable set $\mathcal{E}' = \mathcal{E} - A$, and, by Theorem 4, $f(P)$ is measurable
on $\mathcal{E}'$. In addition, it is measurable on $A$ by Theorem 3 of [42].
Consequently $f(P)$ is measurable on the set $\mathcal{E} = \mathcal{E}' + A$, and the theorem
is proved.

We introduce the new concept of the convergence of a sequence
of functions.


DEFINITION. Let $f_n(P)$ and $f(P)$ be functions measurable on $\mathcal{E}$ and
taking finite values. We say that the sequence $f_n(P)$ is convergent in
measure to $f(P)$ on $\mathcal{E}$ if, given any positive $\varepsilon$, the measure $G(\mathcal{E}_n)$ of
the set $\mathcal{E}_n$ of points at which the inequality $|f(P) - f_n(P)| \ge \varepsilon$ holds,
tends to zero on indefinite increase of $n$.

The connection between convergence in measure and convergence
almost everywhere is established in the next two theorems.

THEOREM 6. Let $\mathcal{E}$ be a measurable set of finite measure and $f_n(P)$ a
sequence of functions measurable on $\mathcal{E}$, which take finite values almost
everywhere on $\mathcal{E}$ and are convergent almost everywhere on $\mathcal{E}$ to the function
$f(P)$, which also takes finite values almost everywhere on $\mathcal{E}$. Then the
$f_n(P)$ are convergent in measure to $f(P)$ on $\mathcal{E}$.

Let $\varepsilon$ be a given positive number. We introduce the set of points $\mathcal{E}_n$:

$$\mathcal{E}_n = \mathcal{E}[\,|f(P) - f_n(P)| \ge \varepsilon\,].$$

We have to show that $G(\mathcal{E}_n) \to 0$. We introduce the sets of points
at which $f(P)$ and $f_n(P)$ take infinite values, and the set on which $f_n(P)$
does not tend to $f(P)$:

$$A = \mathcal{E}[\,|f(P)| = +\infty\,]; \qquad A_n = \mathcal{E}[\,|f_n(P)| = +\infty\,];$$

$$B = \mathcal{E}[f_n(P) \text{ does not} \to f(P)].$$

By hypothesis, all these sets are of measure zero. The same can be
proved for their sum [36]:

$$C = A + \sum_{n=1}^{\infty} A_n + B,$$

i.e. $G(C) = 0$. If $P_0$ does not belong to $C$, $f_n(P_0)$ and $f(P_0)$ have finite
values, and $f_n(P_0) \to f(P_0)$. We introduce the sets

$$R_n = \sum_{k=n}^{\infty} \mathcal{E}_k \quad \text{and} \quad S = \prod_{n=1}^{\infty} R_n. \tag{7}$$

The sequence $R_n$ $(n = 1, 2, \ldots)$ is a non-increasing sequence of sets
of finite measure, since $\mathcal{E}$ has finite measure, and $S$ is the limit set for
$R_n$, so that

$$G(R_n) \to G(S). \tag{8}$$

We show that $S \subset C$, i.e. that, if $P_0$ does not belong to $C$, then $P_0$
does not belong to $S$. In fact, if $P_0$ does not belong to $C$, $f_n(P_0)$ and
$f(P_0)$ are finite and $f_n(P_0) \to f(P_0)$, i.e. there exists an $N$ such that
$|f(P_0) - f_n(P_0)| < \varepsilon$ for $n \ge N$. Hence it follows that $P_0$ does not
belong to $\mathcal{E}_n$ for $n \ge N$, i.e. $P_0$ does not belong to $R_n$ for $n \ge N$, so
that $P_0$ does not belong to $S$. Hence $S \subset C$. But $G(C) = 0$, so that
$G(S) = 0$, and by (8), $G(R_n) \to 0$. But, by the first of formulae (7),
$\mathcal{E}_n \subset R_n$, i.e. all the more $G(\mathcal{E}_n) \to 0$, which is what we wanted to
prove.

Note. We can associate the set $C$ with all the $\mathcal{E}_n$. Since $G(C) = 0$,
we again have $G(\mathcal{E}_n) \to 0$ after such addition, whilst
$|f(P) - f_n(P)| < \varepsilon$ at every point of the set $(\mathcal{E} - \mathcal{E}_n)$. Convergence in
measure does not necessarily imply convergence almost everywhere, but the
following theorem holds.

THEOREM 7. Let $\mathcal{E}$ be a measurable set of finite measure, $f_n(P)$ and
$f(P)$ be measurable on $\mathcal{E}$, where $f_n(P)$ are convergent in measure to $f(P)$
on $\mathcal{E}$. There now exists a subsequence $f_{n_k}(P)$ that tends to $f(P)$ almost
everywhere on $\mathcal{E}$.

We choose a sequence of positive numbers $\delta_k$ $(k = 1, 2, \ldots)$ such
that $\delta_k \to 0$ as $k \to \infty$, and a sequence of positive numbers $\varepsilon_k$ such
that the series $\varepsilon_1 + \varepsilon_2 + \ldots$ is convergent. In view of the convergence
in measure, there exists an indefinitely increasing sequence of subscripts
$n_k$ such that $G(\mathcal{E}_k) < \varepsilon_k$ for the sets
$\mathcal{E}_k = \mathcal{E}[\,|f(P) - f_{n_k}(P)| \ge \delta_k\,]$. We introduce the sets

$$R_n = \sum_{k=n}^{\infty} \mathcal{E}_k; \qquad S = \prod_{n=1}^{\infty} R_n.$$

It is easily shown that $G(S) = 0$. For

$$G(R_n) \le \sum_{k=n}^{\infty} G(\mathcal{E}_k) < \sum_{k=n}^{\infty} \varepsilon_k$$

and the last sum $\to 0$ as $n \to \infty$ by virtue of the convergence of the
series $\varepsilon_1 + \varepsilon_2 + \ldots$, whilst $S \subset R_n$ for every $n$. We now show that
$f_{n_k}(P) \to f(P)$ on the set $\mathcal{E} - S$. Since $G(S) = 0$, this will prove the
theorem.

Let the point $P_0 \in \mathcal{E} - S$ and hence $P_0 \notin S$. Hence it follows that
$P_0$ does not belong to $R_k$ for all sufficiently large $k$, so that $P_0$ does
not belong to $\mathcal{E}_k$ for all sufficiently large $k$, i.e. there exists an $N$ such
that $P_0 \notin \mathcal{E}_k$ for $k \ge N$. On recalling the definition of $\mathcal{E}_k$, we get
$|f(P_0) - f_{n_k}(P_0)| < \delta_k$ for $k \ge N$, whence it follows that
$f_{n_k}(P_0) \to f(P_0)$, since $\delta_k \to 0$ as $k \to \infty$.

Note. It might obviously be assumed that, as in Theorem 6,
$f_n(P)$ and $f(P)$ are only finite almost everywhere on $\mathcal{E}$, and that
$f_n(P)$ is convergent in measure to $f(P)$ on the set remaining after
exclusion of the sets $A$ and $A_n$ from $\mathcal{E}$.
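
The gap between the two kinds of convergence can be seen on a standard example that is not taken from the text and is offered only as a sketch: on $[0, 1)$ with Lebesgue measure, write $n = 2^k + j$ with $0 \le j < 2^k$ and let $f_n$ be the characteristic function of $[j/2^k, (j+1)/2^k)$. The names `block`, `f` and `measure_where_f_is_large` are hypothetical helpers. The sequence tends to 0 in measure, does not converge at any point, and the subsequence $n_k = 2^k$ tends to 0 at every $x > 0$, in the spirit of Theorem 7.

```python
from fractions import Fraction

def block(n):
    """Write n = 2**k + j with 0 <= j < 2**k and return (k, j)."""
    k = n.bit_length() - 1
    return k, n - (1 << k)

def f(n, x):
    """f_n is the characteristic function of [j / 2**k, (j + 1) / 2**k)."""
    k, j = block(n)
    return 1 if Fraction(j, 2**k) <= x < Fraction(j + 1, 2**k) else 0

def measure_where_f_is_large(n):
    """Lebesgue measure of the set where |f_n - 0| >= eps, for any 0 < eps <= 1."""
    k, _ = block(n)
    return Fraction(1, 2**k)

x = Fraction(1, 3)
# f_n(x) = 1 exactly once in each block k = 0, ..., 9, so f_n(x) does not tend to 0.
hits = [n for n in range(1, 1024) if f(n, x) == 1]
# Along n_k = 2**k the values at x are eventually 0.
along_subsequence = [f(2**k, x) for k in range(10)]
print(hits, along_subsequence, measure_where_f_is_large(1024))
```

The measure of the exceptional set halves from one block to the next, which is what makes a summable choice of $\varepsilon_k$ possible along $n_k = 2^k$.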


There is a theorem connecting convergence almost everywhere
with uniform convergence. This theorem was proved in 1911 by
Egoroff. We shall merely state the theorem, since we make no use
of it in future.

THEOREM. Let $\mathcal{E}$ be a measurable set of finite measure and $f_n(P)$
a sequence of functions measurable on $\mathcal{E}$, which take finite values almost
everywhere on $\mathcal{E}$ and are convergent almost everywhere on $\mathcal{E}$ to the
function $f(P)$, which also takes finite values almost everywhere on $\mathcal{E}$.
Now, given any positive $\varepsilon$, there exists a closed set $F$ belonging to
$\mathcal{E}$ such that $G(\mathcal{E} - F) < \varepsilon$ and the convergence $f_n(P) \to f(P)$ is
uniform on $F$.


45. The C property. It can be shown that the measurability of a function is
equivalent to another property, the C property, which is defined with the aid
of the concept of continuity.

We must first introduce some new concepts.

A function $f(P)$, defined on a closed set $F$, is said to be continuous at a point
$P_0$ of this set if, given any positive $\varepsilon$, there exists a positive $\eta$ such that
$|f(P_0) - f(P)| < \varepsilon$, if $P \in F$ and belongs to an $\eta$-neighbourhood of the point $P_0$.
The function $f(P)$ is said to be continuous on the closed set $F$ if it is continuous
at every point of $F$. Notice that, by virtue of our definition of continuity, any
function is continuous at an isolated point $P_0$ of a set, i.e. at a point, an
$\varepsilon$-neighbourhood of which contains no points of $F$ except $P_0$. A similar definition
can be given of continuity on any (not necessarily closed) set. We now introduce
a further concept.

DEFINITION. We say that a function $f(P)$, defined on a measurable set $\mathcal{E}$,
has the C property on this set if, given any positive $\varepsilon$, there exists a closed set $F$
belonging to $\mathcal{E}$ such that, firstly, $G(\mathcal{E} - F) < \varepsilon$, and secondly, $f(P)$ is continuous
on $F$.

The equivalence of the C property and measurability was established by a
theorem proved by Luzin in 1913.

THEOREM. If a function $f(P)$ is defined on a measurable set $\mathcal{E}$ of finite measure
and has finite values almost everywhere on $\mathcal{E}$, the necessary and sufficient condition
for this function to be measurable is that it has the C property on $\mathcal{E}$.

We shall make no use of this theorem and shall not dwell on the proof.


46. Piecewise constant functions. We now define a class of functions
that are often used in theoretical investigations.

DEFINITION. A function $f(P)$, defined on a measurable set $\mathcal{E}$, is
said to be piecewise constant on this set if it takes only a finite or
denumerable set of values on $\mathcal{E}$.

Let $c_k$ $(k = 1, 2, \ldots)$ be the different values taken by $f(P)$ on $\mathcal{E}$,
which may possibly include $(-\infty)$ and $(+\infty)$. For $f(P)$ to be
measurable, it is obviously necessary and sufficient that the sets of
points $\mathcal{E}_k$ on which $f(P)$ is equal to $c_k$ be measurable for all $k$ [42].
We shall in future take into consideration only measurable piecewise
constant functions.

We bring in a new concept. If $\mathcal{E}'$ is a set of points, the characteristic
function of this set is defined as the function $\omega_{\mathcal{E}'}(P)$, defined
throughout the plane, such that $\omega_{\mathcal{E}'}(P) = 1$ if $P$ belongs to $\mathcal{E}'$ and
$\omega_{\mathcal{E}'}(P) = 0$ if $P$ does not belong to $\mathcal{E}'$. A piecewise constant function
$f(P)$ is a linear combination of characteristic functions:

$$f(P) = \sum_{k} c_k\, \omega_{\mathcal{E}_k}(P), \tag{9}$$

where $P$ belongs to $\mathcal{E}$. Since the $\mathcal{E}_k$ have no common points (the $c_k$
are different numbers), only one term in the sum written differs from
zero except in the case when the $c_k$ corresponding to the chosen point
$P$ is zero. All the terms are zero in this latter case.

Obviously, the characteristic function $\omega_{\mathcal{E}'}(P)$ is measurable when
and only when $\mathcal{E}'$ is a measurable set.
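
Formula (9) can be checked on a toy model; this is a sketch only, with a finite set of integers standing in for $\mathcal{E}$ and hypothetical helper names `characteristic`, `level_sets` and `recombined`.

```python
def characteristic(subset):
    """Return the characteristic function of `subset`."""
    return lambda p: 1 if p in subset else 0

def level_sets(f, points):
    """Group the points by the value of f: the result maps each c_k to its set E_k."""
    sets = {}
    for p in points:
        sets.setdefault(f(p), set()).add(p)
    return sets

points = list(range(12))            # a toy finite set standing in for E
f = lambda p: p % 3 - 1             # takes the three values -1, 0, 1
sets = level_sets(f, points)

def recombined(p):
    """Right-hand side of (9): the sum of c_k * omega_{E_k}(p)."""
    return sum(c * characteristic(E_k)(p) for c, E_k in sets.items())

print([recombined(p) for p in points])
```

Each point lies in exactly one level set, so at most one term of the sum is non-zero, as remarked after (9).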

We next prove the possibility of obtaining measurable functions as
the limits of piecewise constant functions. We shall confine ourselves
here to non-negative functions.

THEOREM 1. Given any function $f(P)$, non-negative, bounded and measurable
on a measurable set $\mathcal{E}$, there exists an increasing sequence $f_n(P)$ of
non-negative measurable piecewise constant functions on $\mathcal{E}$ with a finite
number of values, which is uniformly convergent to $f(P)$ on $\mathcal{E}$.

Since $f(P)$ is bounded, a positive number $L$ exists such that
$0 \le f(P) < L$. We divide the interval $[0, L]$ into $2^n$ equal parts by the
points

$$k\frac{L}{2^n} \qquad (k = 1, 2, \ldots, 2^n - 1).$$

We bring in the measurable sets

$$\mathcal{E}_k^{(n)} = \mathcal{E}\left[k\frac{L}{2^n} \le f(P) < (k + 1)\frac{L}{2^n}\right] \qquad (k = 0, 1, \ldots, 2^n - 1)$$

and construct a sequence of functions $f_n(P)$ as follows:

$$f_n(P) = k\frac{L}{2^n}, \quad \text{if } P \in \mathcal{E}_k^{(n)}. \tag{10}$$

It may easily be seen that the sequence $f_n(P)$ satisfies all the
requirements of the theorem. Each of the $f_n(P)$ takes a finite number of
values on $\mathcal{E}$. Further, on passing from $n$ to $n + 1$, each interval

$$\left[k\frac{L}{2^n},\ (k + 1)\frac{L}{2^n}\right)$$

is split into two:

$$\left[2k\frac{L}{2^{n+1}},\ (2k + 1)\frac{L}{2^{n+1}}\right) \quad \text{and} \quad \left[(2k + 1)\frac{L}{2^{n+1}},\ (2k + 2)\frac{L}{2^{n+1}}\right),$$

so that each of the sets $\mathcal{E}_k^{(n)}$ is split into two sets:

$$\mathcal{E}_k^{(n)} = \mathcal{E}_{2k}^{(n+1)} + \mathcal{E}_{2k+1}^{(n+1)}.$$

On the set $\mathcal{E}_{2k}^{(n+1)}$ the function $f_{n+1}(P)$ is equal to the same number
$kL/2^n$ as the function $f_n(P)$ on the whole of the set $\mathcal{E}_k^{(n)}$, whilst on the
set $\mathcal{E}_{2k+1}^{(n+1)}$ the function $f_{n+1}(P)$ is equal to

$$k\frac{L}{2^n} + \frac{L}{2^{n+1}},$$

i.e. the sequence of $f_n(P)$ is increasing. Further, we have on any set
$\mathcal{E}_k^{(n)}$:

$$f_n(P) = k\frac{L}{2^n} \quad \text{and} \quad k\frac{L}{2^n} \le f(P) < (k + 1)\frac{L}{2^n}.$$

Thus

$$0 \le f(P) - f_n(P) < \frac{L}{2^n}$$

at every point $P$ belonging to $\mathcal{E}$, whence it follows that the sequence
$f_n(P)$ tends to $f(P)$ uniformly on $\mathcal{E}$. We consider in the next theorem
the case when $f(P)$ may be unbounded.
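
The construction (10) of Theorem 1 can be sketched numerically as follows. This is only an illustration: the sample function $f(x) = x^2$ on $[0, 1]$, the bound $L = 2$, the sample grid and the helper name `dyadic_approximation` are all assumptions made for the example. Exact rational arithmetic is used so that the inequalities can be checked without rounding.

```python
import math
from fractions import Fraction

def dyadic_approximation(f, L, n):
    """Return f_n, equal to k*L/2**n wherever k*L/2**n <= f < (k+1)*L/2**n."""
    def f_n(x):
        k = math.floor(f(x) * 2**n / L)   # index k of the set containing x
        return k * L / 2**n
    return f_n

f = lambda x: x * x                           # sample bounded function, 0 <= f < L
L = Fraction(2)
xs = [Fraction(i, 50) for i in range(51)]     # sample points of [0, 1]

for n in range(1, 9):
    f_n = dyadic_approximation(f, L, n)
    worst_gap = max(f(x) - f_n(x) for x in xs)
    print(n, float(worst_gap), float(L / 2**n))  # the gap stays below L / 2**n
```

The printed gap is bounded by $L/2^n$ independently of the point, which is the uniformity asserted in the theorem.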

THEOREM 2. Given any function $f(P)$, non-negative, finite and measurable
on the set $\mathcal{E}$, there exists an increasing sequence of functions $f_n(P)$,
non-negative and piecewise constant on $\mathcal{E}$, such that it tends to $f(P)$
uniformly on $\mathcal{E}$.

In this case we subdivide the infinite interval $[0, +\infty)$ with the aid
of the points

$$\frac{k}{2^n} \qquad (k = 1, 2, 3, \ldots).$$

We again define sets $\mathcal{E}_k^{(n)} = \mathcal{E}[k/2^n \le f(P) < (k + 1)/2^n]$ and
functions

$$f_n(P) = \frac{k}{2^n}, \quad \text{if } P \in \mathcal{E}_k^{(n)}. \tag{11}$$

It can be shown, precisely as in the previous theorem, that the
sequence $f_n(P)$ satisfies all the requirements of the theorem. In the
case of an unbounded function $f(P)$, the $f_n(P)$ will take a denumerable
number of values. If we relinquish the uniformity of the convergence
we can confine ourselves to piecewise constant functions with a
finite number of values. In addition, we shall assume in the next
theorem that $(+\infty)$ is a possible value of $f(P)$.

THEOREM 3. Given any non-negative function $f(P)$, measurable on a
measurable set $\mathcal{E}$, there exists an increasing sequence $\varphi_n(P)$ of
non-negative piecewise constant functions on $\mathcal{E}$ with a finite number of finite
values, which tends to $f(P)$ at every point of $\mathcal{E}$.

In addition to the sets $\mathcal{E}_k^{(n)}$ of the previous theorem, we also form
the set $\mathcal{E}_\infty = \mathcal{E}[f(P) = +\infty]$ and introduce the sequence of functions
$\varphi_n(P)$ as follows:

$$\varphi_n(P) = f_n(P), \quad \text{if } f_n(P) < n,$$

and $\varphi_n(P) = n$ if $f_n(P) \ge n$ or $P \in \mathcal{E}_\infty$. It is easily seen that the
$\varphi_n(P)$ satisfy all the requirements of the theorem. We shall have occasion
to use these theorems later.
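
The truncated functions $\varphi_n$ of Theorem 3 can be sketched in the same spirit. The sample function $f(x) = 1/x$ with $f(0) = +\infty$, the use of `math.inf` to stand for $(+\infty)$ and the helper name `phi` are assumptions made for this illustration only.

```python
import math
from fractions import Fraction

def phi(f, n):
    """Return phi_n: the dyadic value f_n of (11), cut off at height n."""
    def phi_n(x):
        y = f(x)
        if y == math.inf:                 # x belongs to the set where f = +infinity
            return Fraction(n)
        f_n = Fraction(math.floor(y * 2**n), 2**n)
        return min(f_n, Fraction(n))      # phi_n = f_n if f_n < n, otherwise n
    return phi_n

def f(x):
    """Sample non-negative function with the value +infinity at x = 0."""
    return math.inf if x == 0 else 1 / x

xs = [Fraction(i, 10) for i in range(11)]
for n in (1, 4, 12):
    print(n, [float(phi(f, n)(x)) for x in xs])
```

At the point where $f = +\infty$ the values $\varphi_n = n$ grow without bound, while at every other sample point $\varphi_n$ eventually agrees with $f_n$ and approaches $f$ from below.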


47. Class B. We mentioned in [41] a closed field of point sets,
such that every set belonging to this field appears in any field of sets
$L_G$. Similarly, we shall now indicate a family of functions such that
any function of the family is a measurable function for any choice of
the function $G(\Delta)$.

DEFINITION. A function $f(P)$, defined on a set $\mathcal{E}$, which is B measurable,
is described as a B function if the sets

$$\mathcal{E}[f \ge a]; \quad \mathcal{E}[f < a]; \quad \mathcal{E}[f > a]; \quad \mathcal{E}[f \le a]$$

are B measurable for any real $a$.

It follows at once from this definition that every B function is
measurable for any choice of $G(\Delta)$. Another definition of B function
can be given, completely analogous to the definition of B measurable
sets that we gave in [41]. We take all possible families of functions
possessing the following two properties: firstly, the family contains all
functions continuous on $\mathcal{E}$, and secondly, if the family contains a
sequence of functions $f_n(P)$ convergent at every point of $\mathcal{E}$, the family
also contains the limit function. The family of B functions is the family
of functions that belong to all families of functions with the two
properties indicated. We shall not dwell on the proof that this last
definition is equivalent to the previous one.


Certain details regarding the last definition must be mentioned.
Every continuous function is a B function, and such a function is
usually said to belong to the zero class. If $f(P)$ is the limit function of a
sequence of continuous functions convergent at every point of $\mathcal{E}$, $f(P)$
itself being discontinuous, such an $f(P)$ is said to belong to the first
class. Every function of the first class is also a B function. If $f(P)$
is the limit function of a sequence of functions of the first class,
convergent at every point of $\mathcal{E}$, where $f(P)$ itself is not of the first class,
we say that $f(P)$ belongs to the second class. Every function of the
second class is also a B function. The subsequent classes of functions
are similarly defined, and all the functions of these classes are B
functions. Let us proceed further with the construction. Let $f(P)$ be
the limit function of a sequence of functions $f_n(P)$, each $f_n(P)$ being a
member of some class with a finite index (number), whilst $f(P)$
itself does not belong to a class with a finite index. We now say that
$f(P)$ belongs to the class of functions with transfinite index $\omega$. Every
function of this class is a B function. We can further define a class of
functions with transfinite index $(\omega + 1)$, and so on. All the B functions
can be obtained by this method. This assertion requires a supplementary
discussion of transfinite numbers, which we must omit.

It can be shown that every function $f(P)$, measurable on a set $\mathcal{E}$,
which is B measurable, is equivalent to some B function $\varphi(P)$ on
this set.


§ 3. The Lebesgue integral 


48. The integral of a bounded function. We shall now give an
ordinary definition of the integral of a bounded function by using a
subdivision into all possible measurable sets, and show that every
bounded measurable function is integrable. Let $f(P)$ be a given
bounded function on a measurable set $\mathcal{E}$ of finite measure, i.e.
we have $|f(P)| \le L$ on $\mathcal{E}$, where $L$ is a positive number. We subdivide
$\mathcal{E}$ into a finite number of pairwise disjoint measurable subsets $\mathcal{E}_k$:

$$\mathcal{E} = \sum_{k=1}^{n} \mathcal{E}_k. \tag{1}$$

Let $m_k$ and $M_k$ be the strict lower and strict upper bounds of the
values of $f(P)$ on $\mathcal{E}_k$. We form the usual sums

$$s_\delta = \sum_{k=1}^{n} m_k G(\mathcal{E}_k); \qquad S_\delta = \sum_{k=1}^{n} M_k G(\mathcal{E}_k), \tag{2}$$

where $\delta$ denotes the subdivision (1) of set $\mathcal{E}$. The sums $s_\delta$ and $S_\delta$ are
bounded for any subdivisions, in fact $|s_\delta|$ and $|S_\delta| \le L \cdot G(\mathcal{E})$.
Further, let $i$ be the strict upper bound of the sums $s_\delta$ and $I$ the strict
lower bound of the sums $S_\delta$ for all possible subdivisions of $\mathcal{E}$ into a
finite number of measurable sets.

DEFINITION. If $i = I$, we say that $f(P)$ is integrable with respect to
$G(\mathcal{E})$ on the set $\mathcal{E}$, and the value of the integral is taken equal to $i$:

$$i = \int_{\mathcal{E}} f(P)\, G(d\mathcal{E}).$$

The integral thus defined is called a Lebesgue-Stieltjes integral.
If $\delta$ is the subdivision (1) and $\delta'$ is any other subdivision:

$$\mathcal{E} = \sum_{k=1}^{n'} \mathcal{E}'_k, \tag{3}$$

the product $\delta\delta'$ of the subdivisions is defined as the subdivision
consisting of all possible subsets $\mathcal{E}_k \mathcal{E}'_l$. These sets are obviously
disjoint. Certain of them may in fact be empty. Subdivision (3) is
said to be an extension of subdivision (1) if every set $\mathcal{E}'_l$ is part of one
of the $\mathcal{E}_k$. If $\delta_2$ is an extension of $\delta_1$, we write $\delta_2 > \delta_1$.
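
The product of two subdivisions and its effect on the sums (2) can be tried out on a toy model. In the sketch below, which is an illustration under stated assumptions and not the author's construction, $\mathcal{E}$ is a finite set of integers, $G$ is taken to be the counting measure (the number of points of a set), empty intersections are simply dropped, and the function names are hypothetical.

```python
def product(delta, delta_prime):
    """All non-empty intersections of a part of delta with a part of delta_prime."""
    return [a & b for a in delta for b in delta_prime if a & b]

def is_extension(fine, coarse):
    """True if every part of `fine` lies inside some part of `coarse`."""
    return all(any(part <= block for block in coarse) for part in fine)

def lower_sum(f, delta):
    """Sum of m_k * G(E_k), with G the counting measure (an assumption of this sketch)."""
    return sum(min(f(p) for p in part) * len(part) for part in delta)

def upper_sum(f, delta):
    """Sum of M_k * G(E_k), with G the counting measure."""
    return sum(max(f(p) for p in part) * len(part) for part in delta)

E = frozenset(range(12))
delta = [frozenset(p for p in E if p % 2 == 0), frozenset(p for p in E if p % 2 == 1)]
delta_prime = [frozenset(range(0, 4)), frozenset(range(4, 8)), frozenset(range(8, 12))]
both = product(delta, delta_prime)

f = lambda p: p * p
print(lower_sum(f, delta), lower_sum(f, both), upper_sum(f, both), upper_sum(f, delta_prime))
```

The product is an extension of both factors, and passing to it raises the lower sum and lowers the upper sum, so that any lower sum is at most any upper sum.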

In addition to sums (2), we form the sum

$$\sigma_\delta = \sum_{k=1}^{n} f(P_k)\, G(\mathcal{E}_k) \tag{4}$$

as for the Stieltjes integral, where $P_k$ is a point of $\mathcal{E}_k$. Everything that
was said in [3] holds for $s_\delta$, $S_\delta$, $\sigma_\delta$, $i$ and $I$.

We next indicate a sequence of subdivisions for any bounded function
$f(P)$ measurable on $\mathcal{E}$ such that $S_\delta - s_\delta \to 0$, i.e. $\sigma_\delta$ has a definite
limit. It follows from this that the integral $i$ of $f(P)$ with respect to
$G(\mathcal{E})$ exists, and that $s_\delta$, $S_\delta$ and $\sigma_\delta$ tend to $i$ for the sequence in
question [3].

Suppose, then, that $f(P)$ is a bounded function, defined and measurable
on $\mathcal{E}$, and let $m$ and $M$ be the strict lower and strict upper bounds
of the values of $f(P)$ on $\mathcal{E}$. We divide the interval $[m, M]$ of variation
of the function into sub-intervals by the points $y_k$:

$$m = y_0 < y_1 < y_2 < \ldots < y_{n-1} < y_n = M, \tag{5}$$

and let $\eta$ be the greatest of the differences $y_k - y_{k-1}$. We define the
following subdivision $\delta$ of the set $\mathcal{E}$ into measurable subsets $\mathcal{E}_k$:

$$\mathcal{E}_1 = \mathcal{E}[y_0 \le f(P) \le y_1]; \qquad \mathcal{E}_k = \mathcal{E}[y_{k-1} < f(P) \le y_k] \quad (k = 2, 3, \ldots, n). \tag{6}$$

It follows at once from this definition of sets $\mathcal{E}_k$ that $y_{k-1} \le m_k$ and
$y_k \ge M_k$, i.e.

$$\sum_{k=1}^{n} y_{k-1} G(\mathcal{E}_k) \le s_\delta \le S_\delta \le \sum_{k=1}^{n} y_k G(\mathcal{E}_k) \tag{7}$$

and all the more

$$\sum_{k=1}^{n} y_{k-1} G(\mathcal{E}_k) \le i \le I \le \sum_{k=1}^{n} y_k G(\mathcal{E}_k). \tag{8}$$

We consider the difference between the extreme sums:

$$\sum_{k=1}^{n} y_k G(\mathcal{E}_k) - \sum_{k=1}^{n} y_{k-1} G(\mathcal{E}_k) = \sum_{k=1}^{n} (y_k - y_{k-1})\, G(\mathcal{E}_k). \tag{9}$$

Noticing that $y_k - y_{k-1} \le \eta$ and that $G(\mathcal{E})$ is additive, we get

$$0 \le \sum_{k=1}^{n} y_k G(\mathcal{E}_k) - \sum_{k=1}^{n} y_{k-1} G(\mathcal{E}_k) \le \eta\, G(\mathcal{E}), \tag{10}$$

so that the difference in (10) tends to zero as $\eta \to 0$. Hence, by (7)
and (8), it follows that $i = I$ and $S_\delta - s_\delta \to 0$. Subdivision (6) of the
basic set $\mathcal{E}$ into subsets $\mathcal{E}_k$ is called a Lebesgue subdivision. It is
defined by the subdivision (5) of the interval $[m, M]$ of variation of the
function $f(P)$. The sums of (7) and (8), corresponding to subdivision
(6), are called Lebesgue sums. The above discussion leads to the following
fundamental theorem.

FUNDAMENTAL THEOREM. A bounded measurable function $f(P)$, given on a measurable set $\mathscr{E}$ of finite measure, is integrable on $\mathscr{E}$, and the value of the integral is equal to the limit of the Lebesgue sums or the sums $\sigma_\delta$, for any choice of points $P_k$ of the Lebesgue subdivision, on indefinite refinement of the subdivision of the interval $[m, M]$ of variation of $f(P)$.

Notice the familiar fact that the sums $\sigma_\delta$ have the same limit for any extensions of the subdivisions mentioned in the fundamental theorem. Since the integral is defined in the usual way, as the limit of sums $\sigma_\delta$, it retains the usual properties of the Riemann integral and the classical Stieltjes integral. We shall prove these properties in the next section.
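As a numerical illustration of the Lebesgue sums in (7), the following Python sketch takes the assumed example $f(x) = x^2$ on $\mathscr{E} = [0, 1]$, with $G$ taken to be ordinary length. Neither this example nor the name `lebesgue_sums` comes from the text. For this $f$ the sets of subdivision (6) are intervals whose lengths can be written down exactly, so the two sums should bracket the value $1/3$ of the integral and differ by $\eta G(\mathscr{E})$, in agreement with (9) and (10).

```python
import math

def lebesgue_sums(n):
    """Lower and upper Lebesgue sums for the assumed f(x) = x**2 on E = [0, 1].

    The interval [m, M] = [0, 1] of variation of f is divided into n equal
    parts by the points y_0 < y_1 < ... < y_n, as in (5).
    """
    m, M = 0.0, 1.0
    ys = [m + (M - m) * k / n for k in range(n + 1)]
    lower = 0.0
    upper = 0.0
    for k in range(1, n + 1):
        # E_k = {x : y_{k-1} < f(x) <= y_k} is the interval
        # (sqrt(y_{k-1}), sqrt(y_k)], whose length is its measure G(E_k).
        measure = math.sqrt(ys[k]) - math.sqrt(ys[k - 1])
        lower += ys[k - 1] * measure
        upper += ys[k] * measure
    eta = (M - m) / n  # greatest difference y_k - y_{k-1}
    return lower, upper, eta

for n in (4, 16, 64, 256):
    lo, up, eta = lebesgue_sums(n)
    print(n, lo, up, up - lo, eta)
```

The subdivision is made on the axis of values of the function, not on the axis of the argument; this is what distinguishes the Lebesgue sums from the Riemann sums for the same function.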




We have termed the above integral a Lebesgue-Stieltjes integral. It is called simply a Lebesgue integral in the particular case when $G(\Delta)$ is the area of the interval $\Delta$.

We have seen that every bounded function with a finite number of discontinuities is measurable. Let $f(P)$ be such a function, given on a finite closed interval $\Delta$. We know that such a function is Riemann integrable over the interval $\Delta$. As a bounded measurable function, it is also Lebesgue integrable. Let us show that the Lebesgue integral is the same as the Riemann integral. In fact, the Lebesgue integral can be obtained simply by taking some sequence of subdivisions of $\Delta$ into measurable sets for which the sum (4) has a definite limit, which in fact gives the value of the Lebesgue integral. But since the function is Riemann integrable, the subdivision of $\Delta$ leads, on indefinite diminishing of the sub-intervals, to a definite limit for sum (4), and this limit is the Riemann integral. It follows from these arguments that the Riemann and Lebesgue integrals are the same.

Lebesgue showed that the necessary and sufficient condition for the existence of the Riemann integral is as follows: $f(P)$ is bounded and the set of its points of discontinuity has Lebesgue measure zero [cf. 10]. As we have shown, such a function is also Lebesgue integrable. The coincidence of the Lebesgue and Riemann integrals can be proved precisely as above. Thus every function, Riemann integrable over a finite closed interval (in the proper sense), is Lebesgue integrable, the Lebesgue and Riemann integrals being the same.


49. Properties of the integral. We give below the basic properties of the Lebesgue-Stieltjes integral. It is assumed in all these theorems that $\mathscr{E}$ is a measurable set of finite measure.

1. If $c$ is a constant, then

$$\int_{\mathscr{E}} c\, G(d\mathscr{E}) = c\, G(\mathscr{E}). \tag{11}$$

The sums $s_\delta$ and $S_\delta$ have the value $cG(\mathscr{E})$ for any subdivision $\delta$, whence (11) follows [3].

2. If $f_1(P)$ and $f_2(P)$ are bounded and measurable on $\mathscr{E}$, then

$$\int_{\mathscr{E}} [f_1(P) + f_2(P)]\, G(d\mathscr{E}) = \int_{\mathscr{E}} f_1(P)\, G(d\mathscr{E}) + \int_{\mathscr{E}} f_2(P)\, G(d\mathscr{E}). \tag{12}$$

Let $\delta'_n$ and $\delta''_n$ be subdivisions, with which $\sigma_{\delta'_n}$ for function $f_1(P)$ and $\sigma_{\delta''_n}$ for $f_2(P)$ have the corresponding integrals as limits. With the subdivision $\delta_n = \delta'_n \delta''_n$, the sums $\sigma_{\delta_n}$ have the corresponding integrals as limits both for $f_1(P)$ and $f_2(P)$, and (12) follows from the theorem on the limit of a sum. In future, we shall not stipulate that the functions be bounded and measurable.


3.

$$\int_{\mathscr{E}} \sum_{k=1}^{m} c_k f_k(P)\, G(d\mathscr{E}) = \sum_{k=1}^{m} c_k \int_{\mathscr{E}} f_k(P)\, G(d\mathscr{E}). \tag{13}$$

The fact that a constant factor can be taken outside the integral sign follows at once from the possibility of taking the factor outside the brackets in the sums $\sigma_\delta$. In addition, property 2 has to be applied several times.

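4. If $f(P) \ge 0$ on $\mathscr{E}$, then

$$\int_{\mathscr{E}} f(P)\, G(d\mathscr{E}) \ge 0. \tag{14}$$

All the sums $\sigma_\delta$ are non-negative.

5. If $f_1(P) \ge f_2(P)$, then

$$\int_{\mathscr{E}} f_1(P)\, G(d\mathscr{E}) \ge \int_{\mathscr{E}} f_2(P)\, G(d\mathscr{E}). \tag{15}$$

It is sufficient to apply 4 to $f_1(P) - f_2(P)$ and make use of 3.

6.

$$\left| \int_{\mathscr{E}} f(P)\, G(d\mathscr{E}) \right| \le \int_{\mathscr{E}} |f(P)|\, G(d\mathscr{E}). \tag{16}$$

Here, it is sufficient to take the product of the Lebesgue subdivisions (6) for $f$ and $|f|$ and write the analogous inequality for sums $\sigma_\delta$.

7. If $a \le f(P) \le b$ on $\mathscr{E}$, then

$$a\, G(\mathscr{E}) \le \int_{\mathscr{E}} f(P)\, G(d\mathscr{E}) \le b\, G(\mathscr{E}) \tag{17}$$

follows directly from 5 and 1.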
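8. If $|f(P)| \le L$, then

$$\left| \int_{\mathscr{E}} f(P)\, G(d\mathscr{E}) \right| \le L\, G(\mathscr{E}). \tag{18}$$

By hypothesis, $-L \le f(P) \le +L$, and (18) is a consequence of 7.

9. If $\mathscr{E} = \mathscr{E}' + \mathscr{E}''$, where $\mathscr{E}'$ and $\mathscr{E}''$ are measurable and have no common points, then

$$\int_{\mathscr{E}} f(P)\, G(d\mathscr{E}) = \int_{\mathscr{E}'} f(P)\, G(d\mathscr{E}) + \int_{\mathscr{E}''} f(P)\, G(d\mathscr{E}). \tag{19}$$

This is proved simply by taking the Lebesgue subdivisions (6) for the sets $\mathscr{E}'$ and $\mathscr{E}''$, forming $\sigma_\delta$ for these subdivisions and taking their sum. This last sum will have a definite limit, which proves (19).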




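10. If $\mathscr{E}$ is split into a finite or denumerable number of measurable sets $\mathscr{E}_k$, then

$$\int_{\mathscr{E}} f(P)\, G(d\mathscr{E}) = \sum_{k} \int_{\mathscr{E}_k} f(P)\, G(d\mathscr{E}). \tag{20}$$

The formula follows at once, for a finite number of terms, by repeated application of property 9. We take the case of an infinite number of sets $\mathscr{E}_k$. Let $|f(P)| \le L$. We can write $\mathscr{E} = \mathscr{E}_1 + \mathscr{E}_2 + \dots + \mathscr{E}_n + R_n$, where $R_n = \mathscr{E} - (\mathscr{E}_1 + \mathscr{E}_2 + \dots + \mathscr{E}_n)$ is obviously a vanishing sequence of sets, so that $G(R_n) \to 0$ [37]. We obtain by applying the property for a finite number of terms:

$$\int_{\mathscr{E}} f(P)\, G(d\mathscr{E}) = \sum_{k=1}^{n} \int_{\mathscr{E}_k} f(P)\, G(d\mathscr{E}) + \int_{R_n} f(P)\, G(d\mathscr{E}). \tag{21}$$

We have for the last integral:

$$\left| \int_{R_n} f(P)\, G(d\mathscr{E}) \right| \le L\, G(R_n),$$

and (21) gives (20) in the limit, since $G(R_n) \to 0$. This property is generally known as the complete additivity of the integral.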
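11. Given any $\varepsilon > 0$, there exists an $\eta > 0$ such that

$$\left| \int_{e} f(P)\, G(d\mathscr{E}) \right| < \varepsilon, \tag{22}$$

if $e \subset \mathscr{E}$ and $G(e) < \eta$.

This property follows at once from the inequality

$$\left| \int_{e} f(P)\, G(d\mathscr{E}) \right| \le L\, G(e).$$

It is usually known as the absolute continuity of the integral.

12. If $\mathscr{E}$ is a set of measure zero, i.e. $G(\mathscr{E}) = 0$, we have

$$\int_{\mathscr{E}} f(P)\, G(d\mathscr{E}) = 0$$

for any function bounded on $\mathscr{E}$.

The function $f(P)$ is measurable on $\mathscr{E}$ [42], and the sums $s_\delta$ and $S_\delta$ are zero for any subdivision.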

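13. If $f(P)$ and $g(P)$ are equivalent on $\mathscr{E}$, then

$$\int_{\mathscr{E}} f(P)\, G(d\mathscr{E}) = \int_{\mathscr{E}} g(P)\, G(d\mathscr{E}). \tag{23}$$

Let $A$ be the part of $\mathscr{E}$ where $f \ne g$. This set $A$ is of measure zero, by hypothesis. Functions $f(P)$ and $g(P)$ coincide on the set $\mathscr{E}' = \mathscr{E} - A$. We thus obtain the two equations:

$$\int_{A} f(P)\, G(d\mathscr{E}) = \int_{A} g(P)\, G(d\mathscr{E}) = 0; \qquad \int_{\mathscr{E}'} f(P)\, G(d\mathscr{E}) = \int_{\mathscr{E}'} g(P)\, G(d\mathscr{E}),$$

addition of which gives (23).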
14. If $f(P) \ge 0$ on $\mathscr{E}$ and

$$\int_{\mathscr{E}} f(P)\, G(d\mathscr{E}) = 0, \tag{24}$$

$f(P)$ is equivalent to zero. We have to show that the measure of the set $\mathscr{E}[f > 0]$ is zero. This set can be written as the sum of sets

$$\mathscr{E}[f > 0] = \sum_{n=1}^{\infty} \mathscr{E}\left[f \ge \frac{1}{n}\right],$$

and if its measure is positive, at least one of the component sets will be of positive measure. Let the measure, say, of the set $B = \mathscr{E}[f \ge 1/n_0]$ be positive. We split the integral into two:

$$\int_{\mathscr{E}} f(P)\, G(d\mathscr{E}) = \int_{B} f(P)\, G(d\mathscr{E}) + \int_{\mathscr{E} - B} f(P)\, G(d\mathscr{E}). \tag{25}$$

Since $f \ge 0$, the second term on the right is non-negative. We have $f \ge 1/n_0$ on the set $B$, so that the first term is $\ge (1/n_0)\, G(B)$. Thus, since $G(B) > 0$, the left-hand side of (25) is positive, which contradicts (24).

15. If $f_n(P)$ is a sequence of functions, measurable on $\mathscr{E}$ and uniformly bounded, i.e. $|f_n(P)| \le L$, where $L$ is a definite positive number (independent of $n$), and this sequence tends almost everywhere on $\mathscr{E}$ to the limit function $f(P)$, we have

$$\lim_{n \to \infty} \int_{\mathscr{E}} f_n(P)\, G(d\mathscr{E}) = \int_{\mathscr{E}} f(P)\, G(d\mathscr{E}). \tag{26}$$

The limit function $f(P)$ satisfies the inequality $|f(P)| \le L$ almost everywhere on $\mathscr{E}$. On passing to an equivalent function if necessary, we can assume that this inequality is satisfied everywhere on $\mathscr{E}$. The integral of a function $f(P)$, measurable and bounded on $\mathscr{E}$, has a meaning.

We form the integral of $f(P) - f_n(P)$ and apply property 6:

$$\left| \int_{\mathscr{E}} [f(P) - f_n(P)]\, G(d\mathscr{E}) \right| \le \int_{\mathscr{E}} |f(P) - f_n(P)|\, G(d\mathscr{E}). \tag{27}$$

Let $\varepsilon$ be a given positive number and $\mathscr{E}_n$ the set of all points of $\mathscr{E}$ at which $|f(P) - f_n(P)| > \varepsilon$. By Theorem 6 of [44], $G(\mathscr{E}_n) \to 0$. At points of the set $(\mathscr{E} - \mathscr{E}_n)$, we have $|f(P) - f_n(P)| \le \varepsilon$. In addition, we have at any point $P$ of $\mathscr{E}$:

$$|f(P) - f_n(P)| \le |f(P)| + |f_n(P)| \le 2L.$$

The integral on the right-hand side of (27) is split into two:

$$\int_{\mathscr{E}} |f(P) - f_n(P)|\, G(d\mathscr{E}) = \int_{\mathscr{E}_n} |f(P) - f_n(P)|\, G(d\mathscr{E}) + \int_{\mathscr{E} - \mathscr{E}_n} |f(P) - f_n(P)|\, G(d\mathscr{E}).$$

It follows from this, by what was said above, that

$$\int_{\mathscr{E}} |f(P) - f_n(P)|\, G(d\mathscr{E}) \le 2L\, G(\mathscr{E}_n) + \varepsilon\, G(\mathscr{E} - \mathscr{E}_n),$$

or, all the more,

$$\int_{\mathscr{E}} |f(P) - f_n(P)|\, G(d\mathscr{E}) \le 2L\, G(\mathscr{E}_n) + \varepsilon\, G(\mathscr{E}).$$

Since $G(\mathscr{E}_n) \to 0$, it further follows that an $N$ exists such that $G(\mathscr{E}_n) \le \varepsilon$ for $n > N$, and therefore

$$\int_{\mathscr{E}} |f(P) - f_n(P)|\, G(d\mathscr{E}) \le [2L + G(\mathscr{E})]\, \varepsilon \quad \text{for } n > N.$$

A comparison with (27) gives us

$$\left| \int_{\mathscr{E}} f(P)\, G(d\mathscr{E}) - \int_{\mathscr{E}} f_n(P)\, G(d\mathscr{E}) \right| \le [2L + G(\mathscr{E})]\, \varepsilon \quad \text{for } n > N,$$

whence (26) follows, since $\varepsilon$ is arbitrary. Notice that (26) can be proved simply on the assumption that $|f_n(P)| \le L$ almost everywhere on $\mathscr{E}$. By passing to an equivalent function if necessary, we can assume that this inequality is satisfied everywhere on $\mathscr{E}$. Our last property establishes the possibility of passing to the limit under the integral sign with the single assumption that the $f_n(P)$ are bounded in absolute value, independently of the subscript $n$. We mentioned a similar property for the Stieltjes integral in [11]. It is a direct consequence of the theorem proved, since, when $f_n(P)$ and $f(P)$ are continuous, the Lebesgue-Stieltjes integral reduces to the Stieltjes integral. Notice that, in the statement of property 15, we can replace the convergence of $f_n(P)$ to $f(P)$ almost everywhere by convergence in measure. The proof remains the same in essence.




16. If $m \le f(P) \le M$ on the set $\mathscr{E}$, the function

$$g(y) = G(\mathscr{E}[m \le f(P) \le y])$$

is an increasing function of $y$, and the Lebesgue-Stieltjes integral reduces to the Stieltjes integral in accordance with the formula:

$$\int_{\mathscr{E}} f(P)\, G(d\mathscr{E}) = \int_{m}^{M} y\, dg(y). \tag{28}$$

This can be seen simply by noticing that the Lebesgue sums (8) are ordinary sums $s_\delta$ and $S_\delta$ for the Stieltjes integral appearing in the right-hand side of (28).

In the case of the Lebesgue integral, i.e. when $G(\Delta)$ is the area of an interval, the integral is often written as follows:

$$\iint_{\mathscr{E}} f(x, y)\, dx\, dy.$$

Similarly, the following notations are used for the Lebesgue integral in the cases of a straight line and three-dimensional space:

$$\int_{\mathscr{E}} f(x)\, dx \quad \text{and} \quad \iiint_{\mathscr{E}} f(x, y, z)\, dx\, dy\, dz.$$
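The estimate used in the proof of property 15 can be checked on a simple case. The Python sketch below assumes the example $f_n(x) = x^n$ on $[0, 1]$ with ordinary length as the measure, so that $L = 1$ and the limit function is equivalent to zero; the example and the name `bound_terms` are illustrative assumptions, not part of the text. Here the set on which $|f - f_n| > \varepsilon$ is an interval of length $1 - \varepsilon^{1/n}$, and the integral of $|f - f_n|$ is $1/(n+1)$.

```python
def bound_terms(n, eps, L=1.0):
    """Quantities from the proof of property 15 for the assumed f_n(x) = x**n.

    Requires 0 < eps < 1. Returns the exact integral of |f - f_n| over [0, 1],
    the measure of E_n = {x : x**n > eps}, and the bound 2 L G(E_n) + eps G(E).
    """
    exact = 1.0 / (n + 1)              # integral of x**n over [0, 1]
    g_en = 1.0 - eps ** (1.0 / n)      # length of the interval (eps**(1/n), 1)
    bound = 2.0 * L * g_en + eps * 1.0 # G(E) = 1 for E = [0, 1]
    return exact, g_en, bound

for n in (1, 10, 100, 1000):
    print(n, bound_terms(n, 0.01))
```

As $n$ increases, the measure of the exceptional set tends to zero for each fixed $\varepsilon$, which is the content of the reference to Theorem 6 of [44] in the proof.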


50. The integral of a non-negative unbounded function. We now define the integral when $f(P)$ is an unbounded and non-negative measurable function on a measurable set $\mathscr{E}$ of finite measure. Here, we shall split $\mathscr{E}$ into a denumerable, as well as finite, number of measurable subsets $\mathscr{E}_k$. For the rest, the construction of the integral will be precisely the same as for a bounded function. Suppose we have some subdivision of $\mathscr{E}$:

$$\mathscr{E} = \sum_{k} \mathscr{E}_k. \tag{29}$$

We form the sums $s_\delta$ and $S_\delta$ corresponding to it:

$$s_\delta = \sum_{k} m_k\, G(\mathscr{E}_k); \qquad S_\delta = \sum_{k} M_k\, G(\mathscr{E}_k). \tag{30}$$


We have here infinite series with non-negative terms, and certain of the numbers $m_k$ and $M_k$ may be equal to $(+\infty)$. If $G(\mathscr{E}_k) = 0$ for some term, the corresponding term is reckoned equal to zero even in the case when the first factor $m_k$ or $M_k$ is equal to $(+\infty)$. The sums of series (30) do not depend on the order of the terms [I; 134]. Notice also that, if a finite subdivision $\delta$ is taken, at least one of the $M_k$ in the sum $S_\delta$ will be equal to $+\infty$ because $f(P)$ is unbounded. The sums $s_\delta$ and $S_\delta$ can take infinite values. As above, let $i$ be the strict upper bound of $s_\delta$ and $I$ the strict lower bound of $S_\delta$. These numbers may be equal to $(+\infty)$. We show that, as in the case of a bounded function, $i = I$. We divide the set as follows: we first extract from it the set $\mathscr{E}_\infty$ on which $f(P) = +\infty$, if there is such a set, whilst we divide the remainder of the set into sets $\mathscr{E}_k$ as follows: we subdivide the interval $[0, +\infty)$ by $0 = y_0 < y_1 < y_2 < \dots$ and form the sets

$$\mathscr{E}_1 = \mathscr{E}[y_0 \le f \le y_1]; \qquad \mathscr{E}_k = \mathscr{E}[y_{k-1} < f \le y_k] \quad (k = 2, 3, \dots). \tag{31}$$

We obviously have $m_k \ge y_{k-1}$, $M_k \le y_k$ and

$$(+\infty)\, G(\mathscr{E}_\infty) + \sum_{k=1}^{\infty} y_{k-1} G(\mathscr{E}_k) \le s_\delta \le S_\delta \le (+\infty)\, G(\mathscr{E}_\infty) + \sum_{k=1}^{\infty} y_k G(\mathscr{E}_k), \tag{32}$$

and all the more,

$$(+\infty)\, G(\mathscr{E}_\infty) + \sum_{k=1}^{\infty} y_{k-1} G(\mathscr{E}_k) \le i \le I \le (+\infty)\, G(\mathscr{E}_\infty) + \sum_{k=1}^{\infty} y_k G(\mathscr{E}_k). \tag{33}$$

If $G(\mathscr{E}_\infty) > 0$, obviously $i = I = +\infty$. Now suppose that $G(\mathscr{E}_\infty) = 0$. Inequalities (32) and (33) can now be rewritten as

$$\sum_{k=1}^{\infty} y_{k-1} G(\mathscr{E}_k) \le s_\delta \le S_\delta \le \sum_{k=1}^{\infty} y_k G(\mathscr{E}_k); \tag{34}$$

$$\sum_{k=1}^{\infty} y_{k-1} G(\mathscr{E}_k) \le i \le I \le \sum_{k=1}^{\infty} y_k G(\mathscr{E}_k). \tag{34$_1$}$$


We shall assume that the subdivision of $[0, \infty)$ is such that the $(y_k - y_{k-1})$ $(k = 1, 2, \dots)$ are bounded. Let $\eta$ be their strict upper bound. On noticing that $y_k \le y_{k-1} + \eta$, we can write

$$\sum_{k=1}^{\infty} y_k G(\mathscr{E}_k) \le \sum_{k=1}^{\infty} y_{k-1} G(\mathscr{E}_k) + \eta\, G(\mathscr{E}). \tag{35}$$

Hence it follows immediately that, if the sum on the right of (34) is $(+\infty)$ for some subdivision with finite $\eta$, the same can be said of the sum on the left of (34). Now, by (34$_1$), $i = I = +\infty$. Conversely, if $I = +\infty$, by (34$_1$), the sum on the left of (35) is $(+\infty)$ for any subdivision with finite $\eta$, so that the sum on the right of (35) is also $(+\infty)$, and, by (34$_1$), $i = +\infty$. Hence, by (34), if $S_\delta$ is $(+\infty)$ for some subdivision with finite $\eta$, $s_\delta$ is also $(+\infty)$, and $s_\delta$ and $S_\delta$ are now equal to $(+\infty)$ for any subdivision with finite $\eta$. In these cases, $i = I = +\infty$. We have in the case of finite sums, as in [48]:

$$0 \le \sum_{k=1}^{\infty} y_k G(\mathscr{E}_k) - \sum_{k=1}^{\infty} y_{k-1} G(\mathscr{E}_k) \le \eta\, G(\mathscr{E}),$$

and this difference tends to zero as $\eta \to 0$, whence it follows that $i = I$. Both the Lebesgue sums (34), as also the sums $s_\delta$ and $S_\delta$, now tend to the value of the integral as $\eta \to 0$. The same can be said of the sums

$$\sigma_\delta = \sum_{k=1}^{\infty} f(P_k)\, G(\mathscr{E}_k) \tag{36}$$

for any choice of $P_k$ of $\mathscr{E}_k$. In the case $i = I = +\infty$, sum (36) is obviously also equal to $(+\infty)$. If $s_{\delta_n}$, $S_{\delta_n}$ and $\sigma_{\delta_n}$ tend to the value of the integral (the case of a finite integral) and $\delta'_n > \delta_n$, the same can be said of the sums $s_{\delta'_n}$, $S_{\delta'_n}$ and $\sigma_{\delta'_n}$.

The fundamental theorem of [48] can be carried over without change to the case of unbounded non-negative and measurable functions. A case of special importance is that when the integral is finite. In this case $f(P)$ is said to be summable on the set $\mathscr{E}$. It follows from what has been said that a necessary condition for a function to be summable is that the set $\mathscr{E}_\infty = \mathscr{E}[f = +\infty]$ be of measure zero.

Another definition, equivalent to the one above, can be given of the integral of an unbounded, non-negative and measurable function. We shall write $[f(P)]_N$ for the bounded non-negative function defined as follows:

$$[f(P)]_N = \begin{cases} f(P), & \text{if } f(P) \le N, \\ N, & \text{if } f(P) > N. \end{cases} \tag{37}$$

The measurability of this function follows at once from the formula $\mathscr{E}[[f]_N > a] = \mathscr{E}[f > a]$ for $a < N$ and $\mathscr{E}[[f]_N > a] = \Lambda$ for $a \ge N$, where $\Lambda$ is the empty set. We form the integrals

$$i_N = \int_{\mathscr{E}} [f]_N\, G(d\mathscr{E}). \tag{38}$$


They increase as $N$ increases, and the integral of $f(P)$ is defined as the limit of this monotonic variable (finite or infinite) as $N \to +\infty$. Let us show that this new definition of the integral is equivalent to the previous one. Suppose first that the set $\mathscr{E}_\infty$ on which $f(P) = +\infty$ has measure zero. The value of integral (38) is equal to the upper bound $i_N$ of the sums $s_\delta^{(N)}$ for the function $[f]_N$. These sums do not exceed the corresponding sums $s_\delta$ for $f$, so that $i_N \le i$. We have to show that the monotonic variable $i_N$ has the limit $i$ as $N \to +\infty$. We use reductio ad absurdum. Let $i_N \to i' < i$ (hence $i'$ is finite). We can take a sum $s_\delta$ for $f(P)$ such that $s_\delta > i'$. We retain in this sum a finite number of terms in such a way that $s'_\delta > i'$ for the finite sum $s'_\delta$ obtained:

$$s'_\delta = {\sum}' m_k\, G(\mathscr{E}_k) > i', \tag{*}$$

where the sum is finite, and the summation is over the remaining terms, as indicated by the prime on the summation sign. If $m_k = +\infty$, $f(P) = +\infty$ at every point of $\mathscr{E}_k$, i.e. $\mathscr{E}_k \subset \mathscr{E}_\infty$, so that $G(\mathscr{E}_k) = 0$, since $G(\mathscr{E}_\infty) = 0$ by hypothesis. As mentioned above, the corresponding term of the sum is taken equal to zero, and we can omit it. It can therefore be assumed that all the $m_k$ are finite in the terms of the sum (*). We form the corresponding sum for $[f]_N$:

$$s'^{(N)}_\delta = {\sum}' m_k^{(N)}\, G(\mathscr{E}_k) \qquad (m_k^{(N)} = \inf\, [f]_N \text{ on } \mathscr{E}_k).$$

If the number $N$ is greater than all the $m_k$ appearing in sum (*) (there is a finite number of these numbers), $m_k^{(N)} = m_k$ and $s'^{(N)}_\delta = s'_\delta > i'$. All the more, the complete sum $s_\delta^{(N)}$ for $[f]_N$ will be $> i'$, so that $i_N = \sup s_\delta^{(N)} > i'$, which contradicts the fact that $i_N$ tends to $i'$ whilst increasing. Hence $i_N \to i$, and, with the second definition of the integral, its value turns out to be the same as with the first definition. If $G(\mathscr{E}_\infty) > 0$, the integral is $(+\infty)$ with the first definition. Let us show that the value is the same with the second definition. We have, since the functions $[f]_N$ are non-negative,

$$i_N = \int_{\mathscr{E}} [f]_N\, G(d\mathscr{E}) \ge \int_{\mathscr{E}_\infty} [f]_N\, G(d\mathscr{E}) = N\, G(\mathscr{E}_\infty),$$

because, by the definition, $[f]_N = N$ at every point of $\mathscr{E}_\infty$. It follows at once from the inequality $i_N \ge N G(\mathscr{E}_\infty)$ that $i_N \to +\infty$ as $N \to +\infty$, which is what we had to prove.
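The second definition lends itself to a direct check. The Python sketch below assumes the example $f(x) = x^{-1/2}$ on $(0, 1]$ with ordinary length as the measure; this example and the names `truncated_integral` and `midpoint_check` are illustrative assumptions. For $N \ge 1$ the truncated function $[f]_N$ equals $N$ on $(0, 1/N^2)$ and equals $f$ elsewhere, so its integral is $2 - 1/N$, which increases to the value $2$ of the integral of $f$.

```python
def truncated_integral(N):
    """Integral over (0, 1] of [f]_N = min(f, N) for the assumed f(x) = x**-0.5.

    Valid for N >= 1. The closed form is 2 - 1/N.
    """
    cut = 1.0 / N**2                      # f(x) > N exactly when x < cut
    flat_part = N * cut                   # [f]_N = N on (0, cut)
    tail_part = 2.0 - 2.0 * cut ** 0.5    # integral of x**-0.5 over [cut, 1]
    return flat_part + tail_part

def midpoint_check(N, cells=100_000):
    """Midpoint-rule approximation of the same integral, as an independent check."""
    h = 1.0 / cells
    return h * sum(min(((j + 0.5) * h) ** -0.5, N) for j in range(cells))

for N in (1, 2, 10, 100):
    print(N, truncated_integral(N))
```

The values form an increasing sequence bounded by 2, so in this example the function is summable even though it is unbounded.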


51. Properties of the integral. By using the sums $\sigma_\delta$, we can prove certain properties of the integral of an unbounded non-negative function precisely as we did in [49]. We can also make use of the second definition of the integral in proving the properties. A further point: if $f(P)$ is a bounded non-negative function, $[f(P)]_N$ is the same as $f(P)$ for sufficiently large $N$, and the new definition of integral is the same as the previous one (of [48]).

We turn to the proof of the properties of the integral. As in [49] we assume that $\mathscr{E}$ is a measurable set of finite measure.

1. If $f_k(P)$ $(k = 1, 2, \dots, m)$ are summable functions, a linear combination of them with non-negative coefficients is a summable function, and (13) holds.

The proof is the same as in [49].

2. If $f(P)$ is summable on $\mathscr{E}$, it is summable on any measurable part $\mathscr{E}'$ of $\mathscr{E}$.

We have for $[f(P)]_N$, by property 9 of [49] and the fact that it is non-negative:

$$\int_{\mathscr{E}'} [f(P)]_N\, G(d\mathscr{E}) \le \int_{\mathscr{E}} [f(P)]_N\, G(d\mathscr{E}).$$

On passing to the limit, we get

$$\int_{\mathscr{E}'} f(P)\, G(d\mathscr{E}) \le \int_{\mathscr{E}} f(P)\, G(d\mathscr{E}), \tag{39}$$

and if the right-hand side is finite, the left-hand side is all the more finite.

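3. If $f(P)$ is summable on $\mathscr{E}$ and the set $\mathscr{E}$ is divided into a finite or denumerable number of measurable sets, (20) holds.

We take the case of an infinite number of sets $\mathscr{E}_k$. We have for a bounded function $[f]_N$:

$$\int_{\mathscr{E}} [f]_N\, G(d\mathscr{E}) = \sum_{k=1}^{\infty} \int_{\mathscr{E}_k} [f]_N\, G(d\mathscr{E}), \tag{40}$$

whence it follows that

$$\int_{\mathscr{E}} [f]_N\, G(d\mathscr{E}) \le \sum_{k=1}^{\infty} \int_{\mathscr{E}_k} f(P)\, G(d\mathscr{E}). \tag{41}$$

We obtain on indefinite increase of $N$:

$$\int_{\mathscr{E}} f(P)\, G(d\mathscr{E}) \le \sum_{k=1}^{\infty} \int_{\mathscr{E}_k} f(P)\, G(d\mathscr{E}). \tag{42}$$

Let us prove the reverse inequality. Since $f(P)$ is non-negative, we can write, by (40), for any given finite $m$:

$$\int_{\mathscr{E}} [f]_N\, G(d\mathscr{E}) \ge \sum_{k=1}^{m} \int_{\mathscr{E}_k} [f]_N\, G(d\mathscr{E}).$$

We obtain in the limit, on indefinite increase of $N$:

$$\int_{\mathscr{E}} f(P)\, G(d\mathscr{E}) \ge \sum_{k=1}^{m} \int_{\mathscr{E}_k} f(P)\, G(d\mathscr{E}).$$

On now increasing $m$ indefinitely, we arrive at the inequality

$$\int_{\mathscr{E}} f(P)\, G(d\mathscr{E}) \ge \sum_{k=1}^{\infty} \int_{\mathscr{E}_k} f(P)\, G(d\mathscr{E}),$$

which is the reverse of (42).

4. If $\mathscr{E}$ is divided into a finite or denumerable number of measurable sets $\mathscr{E}_k$, the function $f(P)$ is summable on each $\mathscr{E}_k$ and the series

$$\sum_{k=1}^{\infty} \int_{\mathscr{E}_k} f(P)\, G(d\mathscr{E}) \tag{43}$$

with non-negative terms has a finite sum (is convergent), $f(P)$ is summable on $\mathscr{E}$, and (20) holds.

We have (40) as above for the function $[f]_N$, and also inequality (41), the right-hand side of which is a finite number. It follows at once from this inequality that integrals (38) have a finite limit, i.e. $f(P)$ is summable on $\mathscr{E}$. After this, (20) follows from the summability.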

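5. If $f(P)$ is summable on $\mathscr{E}$, given any $\varepsilon > 0$ there exists an $\eta > 0$ such that

$$\int_{e} f(P)\, G(d\mathscr{E}) < \varepsilon \tag{44}$$

when $e \subset \mathscr{E}$ and $G(e) < \eta$.

We can fix an $N$ such that

$$\int_{\mathscr{E}} \{f - [f]_N\}\, G(d\mathscr{E}) < \frac{\varepsilon}{2}.$$

Now, by (39), we have for any $e \subset \mathscr{E}$:

$$\int_{e} \{f - [f]_N\}\, G(d\mathscr{E}) < \frac{\varepsilon}{2}; \qquad \int_{e} f\, G(d\mathscr{E}) < \int_{e} [f]_N\, G(d\mathscr{E}) + \frac{\varepsilon}{2} \le N\, G(e) + \frac{\varepsilon}{2},$$

where the last step uses $[f]_N \le N$, and we obtain the required inequality when $G(e) < \varepsilon/2N$.

The last two properties show that the integral of an unbounded non-negative function is completely additive and absolutely continuous, like the integral of a bounded function.

6. If $\mathscr{E}$ is a set of measure zero, the integral of $f(P)$ is zero. The proof is the same as in [49].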

7. Integrals of functions equivalent on $\mathscr{E}$ are equal.

8. If the integral of $f(P)$ vanishes, the function is equivalent to zero.

9. If $f_2(P) \le f_1(P)$ on $\mathscr{E}$ and $f_1(P)$ is summable, $f_2(P)$ is also summable and we have

$$\int_{\mathscr{E}} f_2(P)\, G(d\mathscr{E}) \le \int_{\mathscr{E}} f_1(P)\, G(d\mathscr{E}). \tag{45}$$

10. If $\omega_n(P)$ is a sequence of non-negative functions summable on $\mathscr{E}$ and

$$\int_{\mathscr{E}} \omega_n(P)\, G(d\mathscr{E}) \to 0 \quad \text{as } n \to \infty,$$

then $\omega_n(P) \to 0$ in measure on $\mathscr{E}$.

The proof of properties 7, 8 and 9 is the same as in [49]. Inequality (45) can be written for $[f_2]_N$ and $[f_1]_N$, and we get (45) on passing to the limit as $N \to \infty$. Property 10 is proved by reductio ad absurdum. If $\omega_n(P)$ does not $\to 0$ in measure, there exists a $\delta > 0$ such that $G(\mathscr{E}_n)$ does not $\to 0$, where $\mathscr{E}_n = \mathscr{E}[\omega_n(P) \ge \delta]$. Hence it follows that there exists a subsequence $\mathscr{E}_{n_k}$ such that $G(\mathscr{E}_{n_k}) \ge d$, where $d$ is a positive number. We have

$$\int_{\mathscr{E}} \omega_{n_k}(P)\, G(d\mathscr{E}) \ge \int_{\mathscr{E}_{n_k}} \omega_{n_k}(P)\, G(d\mathscr{E}) \ge \delta\, G(\mathscr{E}_{n_k}) \ge \delta d,$$

whence it follows that the integral on the left does not $\to 0$, and this contradicts the hypothesis.
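The inequality on which the proof of property 10 rests, namely that the integral of $\omega_n$ is at least $\delta$ times the measure of the set where $\omega_n \ge \delta$, can be tried out on a measure concentrated on finitely many points, for which the integral is simply a weighted sum. The sketch below is only an illustration under that assumption; the weights `G`, the sequence `omega` and the function names are hypothetical choices, not taken from the text.

```python
# Assumed finite measure space: points j = 1, ..., 50 with weights G({j}) = 1/j**2.
points = range(1, 51)
G = {j: 1.0 / j**2 for j in points}

def omega(n, j):
    """Hypothetical non-negative sequence omega_n evaluated at the point j."""
    return j / (50.0 * n)

def integral_omega(n):
    """Integral of omega_n with respect to the discrete measure G."""
    return sum(omega(n, j) * G[j] for j in points)

def measure_where_large(n, delta):
    """Measure of the set E_n on which omega_n >= delta."""
    return sum(G[j] for j in points if omega(n, j) >= delta)

for n in (1, 10, 100):
    print(n, integral_omega(n), measure_where_large(n, 0.05))
```

In this example the integrals tend to zero like $1/n$, and for each fixed $\delta$ the set where $\omega_n \ge \delta$ becomes empty once $n$ is large enough.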

We now turn to the definition of the Lebesgue—Stieltjes integral 
for an unbounded function that can change sign. The integral can 
be defined as above for negative (non-positive) functions. 


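52. Functions of any sign. Let $f(P)$ be a real measurable function given on a measurable set $\mathscr{E}$ of finite measure, and taking values of either sign. We introduce the so-called positive and negative parts of $f(P)$:

$$f^+(P) = \begin{cases} f(P), & \text{if } f(P) \ge 0, \\ 0, & \text{if } f(P) < 0; \end{cases} \qquad f^-(P) = \begin{cases} -f(P), & \text{if } f(P) \le 0, \\ 0, & \text{if } f(P) > 0. \end{cases}$$

This definition can be written alternatively as

$$f^+(P) = \frac{|f(P)| + f(P)}{2}; \qquad f^-(P) = \frac{|f(P)| - f(P)}{2}. \tag{46}$$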




We can now write $f(P)$ as the difference between two non-negative functions:

$$f(P) = f^+(P) - f^-(P).$$

DEFINITION. A function $f(P)$ is said to be summable on $\mathscr{E}$ if $f^+(P)$ and $f^-(P)$ are summable on $\mathscr{E}$. The integral of $f(P)$ is now given by

$$\int_{\mathscr{E}} f(P)\, G(d\mathscr{E}) = \int_{\mathscr{E}} f^+(P)\, G(d\mathscr{E}) - \int_{\mathscr{E}} f^-(P)\, G(d\mathscr{E}). \tag{47}$$

Notice that, if only one of the functions $f^+(P)$ or $f^-(P)$ is summable, the last formula gives a definite though infinite value for the integral of $f(P)$. For instance, if $f^+(P)$ is summable, whilst $f^-(P)$ is not, the integral of $f(P)$ is equal to $(-\infty)$.

THEOREM. The necessary and sufficient condition for $f(P)$ to be summable on $\mathscr{E}$ is that the non-negative function $|f(P)|$ be summable on $\mathscr{E}$.

If $f(P)$ is summable, $f^+(P)$ and $f^-(P)$ are summable, so that their sum $|f(P)| = f^+(P) + f^-(P)$ is also summable. Conversely, if the sum $f^+(P) + f^-(P)$ is summable, each term is summable by property 9 of [51], so that $f(P)$ is also summable. Notice that the division into positive and negative parts can also be performed for a bounded function, and (47) holds for the integral. In future, we shall often use the term "summable function" for a bounded measurable function. We now turn to the basic properties of the integral of a summable function of any sign. These properties follow almost immediately from the analogous properties of the integrals of the non-negative functions $f^+(P)$ and $f^-(P)$.
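The decomposition into positive and negative parts and formula (47) can be illustrated on a measure concentrated on finitely many points, where the integral reduces to a weighted sum. The weights and function values in the Python sketch below are hypothetical and serve only to show the bookkeeping.

```python
# Assumed finite measure space: six points with weights G({P}) > 0.
weights = [0.5, 1.0, 0.25, 2.0, 0.75, 1.5]
values = [3.0, -1.0, 0.0, -2.5, 4.0, 1.0]   # hypothetical values f(P)

def integral(f, g):
    """Integral of f with respect to the discrete measure g: sum of f(P) G({P})."""
    return sum(fp * gp for fp, gp in zip(f, g))

f_plus = [max(v, 0.0) for v in values]    # positive part f+
f_minus = [max(-v, 0.0) for v in values]  # negative part f-
f_abs = [abs(v) for v in values]          # |f| = f+ + f-

total = integral(values, weights)
by_parts = integral(f_plus, weights) - integral(f_minus, weights)  # formula (47)
abs_total = integral(f_abs, weights)

print(total, by_parts, abs_total)
```

The two ways of computing the integral agree, and its absolute value does not exceed the integral of $|f|$, which is the content of inequality (16) for summable functions.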

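1. If $f_k(P)$ $(k = 1, 2, \dots, p)$ are summable functions, a linear combination of them with constant coefficients is a summable function and (13) holds.

The summability of a linear combination follows at once from the inequality

$$\left| \sum_{k=1}^{p} c_k f_k(P) \right| \le \sum_{k=1}^{p} |c_k|\, |f_k(P)|,$$

the theorem proved above, and property 1 of [51]. To prove (13), we take separately the case of multiplying a function by a constant and the case of addition of two functions. Let $f(P)$ be summable and $c$ be a constant. We have to show that

$$\int_{\mathscr{E}} c f(P)\, G(d\mathscr{E}) = c \int_{\mathscr{E}} f(P)\, G(d\mathscr{E}).$$

We assume $c$ negative for definiteness. We now have $(cf)^+ = -c f^-$ and $(cf)^- = -c f^+$. Definition (47) gives us

$$\int_{\mathscr{E}} c f\, G(d\mathscr{E}) = -c \int_{\mathscr{E}} f^-\, G(d\mathscr{E}) + c \int_{\mathscr{E}} f^+\, G(d\mathscr{E}) = c \int_{\mathscr{E}} f\, G(d\mathscr{E}),$$

and the formula is proved. We now suppose that $f_1(P)$ and $f_2(P)$ are summable functions. We have to show that

$$\int_{\mathscr{E}} (f_1 + f_2)\, G(d\mathscr{E}) = \int_{\mathscr{E}} f_1\, G(d\mathscr{E}) + \int_{\mathscr{E}} f_2\, G(d\mathscr{E}). \tag{48}$$

We split the functions $f_1$, $f_2$ and $f = f_1 + f_2$ into positive and negative parts:

$$f_1 = f_1^+ - f_1^-; \qquad f_2 = f_2^+ - f_2^-; \qquad f = f^+ - f^-.$$

We have

$$f_1^+ + f_2^+ + f^- = f_1^- + f_2^- + f^+.$$

All the functions written are non-negative and summable. On applying property 1 of [51], we get

$$\int_{\mathscr{E}} f_1^+\, G(d\mathscr{E}) + \int_{\mathscr{E}} f_2^+\, G(d\mathscr{E}) + \int_{\mathscr{E}} f^-\, G(d\mathscr{E}) = \int_{\mathscr{E}} f_1^-\, G(d\mathscr{E}) + \int_{\mathscr{E}} f_2^-\, G(d\mathscr{E}) + \int_{\mathscr{E}} f^+\, G(d\mathscr{E}),$$

whence

$$\int_{\mathscr{E}} f^+\, G(d\mathscr{E}) - \int_{\mathscr{E}} f^-\, G(d\mathscr{E}) = \int_{\mathscr{E}} f_1^+\, G(d\mathscr{E}) - \int_{\mathscr{E}} f_1^-\, G(d\mathscr{E}) + \int_{\mathscr{E}} f_2^+\, G(d\mathscr{E}) - \int_{\mathscr{E}} f_2^-\, G(d\mathscr{E}),$$

which proves (48).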

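2. If $f_1(P)$ and $f_2(P)$ are summable on $\mathscr{E}$ and $f_1(P) \ge f_2(P)$, (15) holds.

By property 1, the non-negative function $f_1(P) - f_2(P)$ is summable and the integral of it (it is non-negative) is equal to the difference between the integrals of $f_1(P)$ and $f_2(P)$, which proves (15).

3. If $f(P)$ is summable, (16) holds.

Inequality (16) is equivalent to the following obvious inequality:

$$\left| \int_{\mathscr{E}} f^+\, G(d\mathscr{E}) - \int_{\mathscr{E}} f^-\, G(d\mathscr{E}) \right| \le \int_{\mathscr{E}} f^+\, G(d\mathscr{E}) + \int_{\mathscr{E}} f^-\, G(d\mathscr{E}).$$

4. If $f(P)$ is summable on $\mathscr{E}$, it is summable on any measurable part $\mathscr{E}'$ of $\mathscr{E}$.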




5. If $f(P)$ is summable on $\mathscr{E}$, and $\mathscr{E}$ is divided into a finite or denumerable number of measurable sets $\mathscr{E}_k$, (20) holds.

These last two properties follow from the fact that they hold for $f^+(P)$ and $f^-(P)$.

6. If $\mathscr{E}$ is divided into a finite or denumerable number of measurable sets $\mathscr{E}_k$, $f(P)$ is summable on each $\mathscr{E}_k$ and the series

$$\sum_{k=1}^{\infty} \int_{\mathscr{E}_k} |f(P)|\, G(d\mathscr{E}) \tag{49}$$

is convergent, then $f(P)$ is summable on $\mathscr{E}$ and (20) holds.

The non-negative $|f(P)|$ is summable on all the $\mathscr{E}_k$, and it follows from the convergence of series (49), by property 4 of [51], that $|f(P)|$ is summable on $\mathscr{E}$, so that $f(P)$ is also summable on $\mathscr{E}$. After this, (20) follows from the last property. Notice that the convergence of series (43) is not sufficient for us to assert that $f(P)$ is summable.

7. If $f(P)$ is summable on $\mathscr{E}$, given any $\varepsilon > 0$ there exists an $\eta > 0$ such that

$$\left| \int_{e} f(P)\, G(d\mathscr{E}) \right| < \varepsilon$$

when $e \subset \mathscr{E}$ and $G(e) < \eta$.

This property follows at once from the fact that it holds for $f^+(P)$ and $f^-(P)$.

We have thus proved that the integral of any summable function is completely additive and absolutely continuous.

8. If $\mathscr{E}$ is a set of measure zero, the integral of any $f(P)$ over $\mathscr{E}$ is zero.

9. The integrals of functions equivalent on $\mathscr{E}$ are equal.

Both properties are the result of the analogous properties for $f^+(P)$ and $f^-(P)$; we have to notice here that, if two functions are equivalent, their positive and negative parts are equivalent.

10. If $f(P)$ is measurable on $\mathscr{E}$, $F(P)$ is measurable, non-negative and summable on $\mathscr{E}$ and $|f(P)| \le F(P)$, $f(P)$ is summable, and we have

$$\left| \int_{\mathscr{E}} f(P)\, G(d\mathscr{E}) \right| \le \int_{\mathscr{E}} F(P)\, G(d\mathscr{E}). \tag{50}$$

We can say, by property 9 of [51], that $|f(P)|$ is summable, so that $f(P)$ is also summable. Inequality (50) follows at once from property 3 and property 9 of [51]. An immediate consequence of what has been proved is that the product of a summable function and a bounded measurable function is summable.




We must mention two further properties of the integral, which will be useful later.

11. If $f(P)$ is summable on a finite interval $\Delta_0$, and the integral of it over any interval $\Delta$ belonging to $\Delta_0$ is zero, $f(P)$ is equivalent to zero on $\Delta_0$.

We use reductio ad absurdum. If $f(P)$ is not equivalent to zero, there exists a positive $a$ such that one of the sets $\Delta_0[f(P) \ge a]$ or $\Delta_0[f(P) \le -a]$ has measure greater than zero. Let this be the first set, and let us write it as $\mathscr{E}$. We have

$$\int_{\mathscr{E}} f(P)\, G(d\mathscr{E}) \ge a\, G(\mathscr{E}) > 0.$$

But there exist sets $e_1$ and $e_2$ of as small measure as desired such that $\mathscr{E} + e_1 = R + e_2$, where $R$ is an elementary figure, i.e. a finite sum of intervals with no common points. By hypothesis, the integral of $f(P)$ over $R$ must be zero, and we can write

$$\int_{\mathscr{E}} f(P)\, G(d\mathscr{E}) = \int_{e_2} f(P)\, G(d\mathscr{E}) - \int_{e_1} f(P)\, G(d\mathscr{E}).$$

Since the integral is absolutely continuous, the right-hand side can be made as small as desired in absolute value, whilst the left keeps a definite positive value. We have arrived at an absurdity, and our assertion is proved.

12. If $f(P)$ is summable on $\mathscr{E}$ and satisfies the condition

$$\int_{\mathscr{E}} \varphi(P)\, f(P)\, G(d\mathscr{E}) = 0 \tag{51}$$

for any choice of $\varphi(P)$, measurable and bounded on $\mathscr{E}$, $f(P)$ is equivalent to zero. If (51) is satisfied for any choice of $\varphi(P)$, measurable and bounded on $\mathscr{E}$ and such that

$$\int_{\mathscr{E}} \varphi(P)\, G(d\mathscr{E}) = 0, \tag{52}$$

$f(P)$ is equivalent to a constant.

Let $\mathscr{E}'$ be the part of $\mathscr{E}$ where $f(P) \ge 0$. We take as $\varphi(P)$ a function equal to unity on $\mathscr{E}'$ and zero on $\mathscr{E} - \mathscr{E}'$. Condition (51) shows that the integral of $f^+(P)$ over $\mathscr{E}$ is zero, so that, by property 8 of [51], $f^+(P)$ must be equivalent to zero. Similarly, we can show that $f^-(P)$ is equivalent to zero, so that $f(P)$ is equivalent to zero.

We turn to the proof of the second part of the assertion. On writing $kG(\mathscr{E})$ for the integral of $f(P)$ over $\mathscr{E}$, by (52), $f(P) - k$ also satisfies condition (51), i.e.

$$\int_{\mathscr{E}} \varphi(P)\, [f(P) - k]\, G(d\mathscr{E}) = 0.$$

Moreover, by the definition of $k$, we have for any choice of constant $c$:

$$\int_{\mathscr{E}} c\, [f(P) - k]\, G(d\mathscr{E}) = 0. \tag{53}$$

Let $\psi(P)$ be any measurable function bounded on $\mathscr{E}$, and $cG(\mathscr{E})$ the value of its integral. The function $\varphi(P) = \psi(P) - c$ satisfies condition (52), and we have

$$\int_{\mathscr{E}} [\psi(P) - c]\, [f(P) - k]\, G(d\mathscr{E}) = 0$$

or, by (53):

$$\int_{\mathscr{E}} \psi(P)\, [f(P) - k]\, G(d\mathscr{E}) = 0,$$

whence it follows, by what has been proved above, that $f(P) - k$ is equivalent to zero, i.e. $f(P)$ is equivalent to $k$.

We conclude this section by considering the Lebesgue-Stieltjes integral for a single variable. Let $g(x)$ be a non-decreasing function, which is at the basis of the measurement, and $f(x)$ be measurable with respect to $g(x)$ and summable on the measurable set $\mathscr{E}$ of the $X$ axis or on the interval $[a, b]$ or on $(a, b)$, etc.:

$$\int_{\mathscr{E}} f(x)\, dg(x) \quad \text{or} \quad \int_{[a, b]} f(x)\, dg(x), \quad \int_{(a, b)} f(x)\, dg(x) \quad \text{etc.}$$

If $g(x) = x$, we have the Lebesgue integral. In this case the measure of any point is zero, and it is of no importance whether or not the ends are associated with an interval; the integral over an interval is here usually written as

$$\int_{a}^{b} f(x)\, dx.$$

We are assuming $a < b$. In addition, the following convention is adopted:

$$\int_{b}^{a} f(x)\, dx = -\int_{a}^{b} f(x)\, dx.$$


53. Complex summable functions. It is easy to define a summable function, and an integral, for a function $f(P)$ that takes complex values. We split the function into real and imaginary parts:

$$f(P) = f_1(P) + i f_2(P).$$

We say that $f(P)$ is summable if $f_1(P)$ and $f_2(P)$ are summable, and the integral of $f(P)$ is defined as

$$\int_{\mathscr{E}} f(P)\, G(d\mathscr{E}) = \int_{\mathscr{E}} f_1(P)\, G(d\mathscr{E}) + i \int_{\mathscr{E}} f_2(P)\, G(d\mathscr{E}). \tag{54}$$

The theorem proved above holds in this case: the necessary and sufficient condition for $f(P)$ to be summable is that the modulus $|f(P)|$ be summable.

We notice first of all that, since $f_1(P)$ and $f_2(P)$ are measurable, the sum of their squares is measurable, so that the arithmetic value of the square root of this sum is measurable, i.e. $|f| = \sqrt{f_1^2 + f_2^2}$ is measurable, as follows at once from the formula

$$\mathscr{E}\left[\sqrt{f_1^2 + f_2^2} > a\right] = \mathscr{E}\left[f_1^2 + f_2^2 > a^2\right] \quad (a \ge 0).$$

Further, it follows from the inequalities:

$$|f_1| \le \sqrt{f_1^2 + f_2^2}; \qquad |f_2| \le \sqrt{f_1^2 + f_2^2}; \qquad \sqrt{f_1^2 + f_2^2} \le |f_1| + |f_2|$$

and properties 9 and 1 of [51] that the summability of $|f_1|$ and $|f_2|$ is equivalent to the summability of $|f|$, whence follows the assertion that we made above.

Further, properties 1, 3, 4, 5, 6, 7, 8, 9 and 10 above still hold, and complex constant coefficients $c_k$ can be used when forming a linear combination of the $f_k(P)$. We shall only dwell on the proof of property 3:

$$\left| \int_{\mathscr{E}} f\, G(d\mathscr{E}) \right| \le \int_{\mathscr{E}} |f|\, G(d\mathscr{E}). \tag{55}$$

The functions $f_1$, $f_2$ and $\sqrt{f_1^2 + f_2^2}$ are summable, so that the sums $\sigma_{\delta_n^{(1)}}$, $\sigma_{\delta_n^{(2)}}$, $\sigma_{\delta_n^{(3)}}$, corresponding to the sequences of Lebesgue subdivisions of these functions, tend to the respective integrals. If we take the sequence of subdivisions $\delta_n = \delta_n^{(1)} \delta_n^{(2)} \delta_n^{(3)}$, the sums $\sigma_{\delta_n}$ for $f_1$, $f_2$ and $\sqrt{f_1^2 + f_2^2}$ will all the more tend to the respective integrals. If $\delta_n$ is a subdivision of $\mathscr{E}$ into $\mathscr{E}_k^{(n)}$, and $P_k^{(n)}$ are any points of the $\mathscr{E}_k^{(n)}$, we have

$$\left| \sum_{k} \left[ f_1(P_k^{(n)}) + i f_2(P_k^{(n)}) \right] G(\mathscr{E}_k^{(n)}) \right| \le \sum_{k} \left| f_1(P_k^{(n)}) + i f_2(P_k^{(n)}) \right| G(\mathscr{E}_k^{(n)}),$$

and (55) is obtained in the limit.
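The inequality for the finite sums used in the last step is just the triangle inequality for complex numbers with non-negative weights. The Python sketch below illustrates definition (54) and inequality (55) for a measure concentrated on four points; the weights and the complex values are hypothetical and chosen only for the illustration.

```python
# Assumed finite measure space: four points with weights G({P}) > 0.
weights = [0.5, 1.0, 0.25, 2.0]
values = [1 + 2j, -3 + 0.5j, 2j, -1 - 1j]   # hypothetical complex values f(P)

def integral(f, g):
    """Integral of f with respect to the discrete measure g: sum of f(P) G({P})."""
    return sum(fp * gp for fp, gp in zip(f, g))

real_part = [z.real for z in values]   # f_1
imag_part = [z.imag for z in values]   # f_2

# Definition (54): integrate the real and imaginary parts separately.
by_definition = integral(real_part, weights) + 1j * integral(imag_part, weights)
direct = integral(values, weights)
modulus_integral = integral([abs(z) for z in values], weights)

print(direct, by_definition, abs(direct), modulus_integral)
```

The modulus of the integral does not exceed the integral of the modulus, as (55) asserts.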


54. Passage to the limit under the integral sign. Let us prove some 
theorems on passage to the limit under the integral sign for summable 
functions. 




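THEOREM 1. If $f_n(P)$ is a sequence of functions, summable on a set $\mathcal{E}$
of finite measure, where

$$ |f_n(P)| \le F(P) \tag{56} $$

for all these functions, $F(P)$ is summable on $\mathcal{E}$, and $f_n(P) \to f(P)$ almost
everywhere on $\mathcal{E}$, then $f(P)$ is summable on $\mathcal{E}$, and

$$ \lim_{n \to \infty} \int_{\mathcal{E}} f_n(P)\,G(d\mathcal{E}) = \int_{\mathcal{E}} f(P)\,G(d\mathcal{E}). \tag{57} $$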


It follows from the hypotheses that the limit function satisfies
almost everywhere on $\mathcal{E}$ the inequality

$$ |f(P)| \le F(P). \tag{$56_1$} $$

By passing to the equivalent function, we can assume that this
inequality is satisfied everywhere on $\mathcal{E}$. By property 10 of [52],
$f_n(P)$ and $f(P)$ are summable on $\mathcal{E}$ and hence have finite values
almost everywhere on $\mathcal{E}$. We take the integral of $f(P) - f_n(P)$ and
apply property 10 of [51]:


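$$ \left| \int_{\mathcal{E}} f(P)\,G(d\mathcal{E}) - \int_{\mathcal{E}} f_n(P)\,G(d\mathcal{E}) \right| \le \int_{\mathcal{E}} |f(P) - f_n(P)|\,G(d\mathcal{E}). \tag{58} $$

Let $\varepsilon$ be a given positive number, and $\mathcal{E}_n$ the sets belonging to $\mathcal{E}$
about which we spoke in Theorem 6 of [44]. We have, by this theorem,

$$ G(\mathcal{E}_n) \to 0 \quad \text{and} \quad |f(P) - f_n(P)| < \varepsilon, \ \text{if } P \in \mathcal{E} - \mathcal{E}_n. \tag{59} $$

In addition, we have at any point $P$ of $\mathcal{E}$:

$$ |f(P) - f_n(P)| \le |f(P)| + |f_n(P)| \le 2F(P). \tag{60} $$

We split the integral on the right-hand side of (58) into two:

$$ \int_{\mathcal{E}} |f(P) - f_n(P)|\,G(d\mathcal{E}) = \int_{\mathcal{E}_n} |f(P) - f_n(P)|\,G(d\mathcal{E}) + \int_{\mathcal{E} - \mathcal{E}_n} |f(P) - f_n(P)|\,G(d\mathcal{E}). $$

Hence, by (59) and (60):

$$ \int_{\mathcal{E}} |f(P) - f_n(P)|\,G(d\mathcal{E}) \le 2 \int_{\mathcal{E}_n} F(P)\,G(d\mathcal{E}) + \varepsilon\,G(\mathcal{E} - \mathcal{E}_n), $$

or all the more

$$ \int_{\mathcal{E}} |f(P) - f_n(P)|\,G(d\mathcal{E}) \le 2 \int_{\mathcal{E}_n} F(P)\,G(d\mathcal{E}) + \varepsilon\,G(\mathcal{E}). \tag{61} $$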




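Since $G(\mathcal{E}_n) \to 0$ and the integral of $F(P)$ is absolutely continuous,
it follows that there exists an $N$ such that

$$ \int_{\mathcal{E}_n} F(P)\,G(d\mathcal{E}) \le \varepsilon \quad \text{for } n > N, $$

and, by (61):

$$ \int_{\mathcal{E}} |f(P) - f_n(P)|\,G(d\mathcal{E}) \le [2 + G(\mathcal{E})]\,\varepsilon \quad \text{for } n > N. $$

Comparison with (58) gives us

$$ \left| \int_{\mathcal{E}} f(P)\,G(d\mathcal{E}) - \int_{\mathcal{E}} f_n(P)\,G(d\mathcal{E}) \right| \le [2 + G(\mathcal{E})]\,\varepsilon \quad \text{for } n > N, $$

whence (57) follows, since $\varepsilon$ is arbitrary. As when proving property
15 of [49], it is sufficient to assume that inequality (56) is satisfied
almost everywhere on $\mathcal{E}$.

Note. It may easily be seen that we have only used the convergence
in measure $f_n(P) \to f(P)$, so that convergence almost
everywhere can be replaced in the statement of the theorem by
convergence in measure.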
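
The role of the summable majorant $F(P)$ in (56) can be seen from a standard example. It is given here only as a sketch and is not taken from the text: we assume $\mathcal{E} = [0, 1]$ with ordinary Lebesgue measure, and the functions $f_n$ are chosen purely for illustration.

```latex
f_n(x) = \begin{cases} n, & 0 < x < 1/n, \\ 0, & \text{elsewhere on } [0,1], \end{cases}
\qquad
\lim_{n \to \infty} f_n(x) = 0 \ \text{ for every } x \in [0,1],
\qquad
\int_0^1 f_n(x)\,dx = 1 \ \text{ for every } n .
```

Here (57) fails, since every integral is equal to 1 while the integral of the limit function is 0. This does not contradict Theorem 1: any $F(x)$ satisfying (56) for all $n$ must be at least $\sup_n f_n(x) \ge 1/x - 1$ on $(0, 1)$, and such a function is not summable on $[0, 1]$.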

THEOREM 2. If $f_n(P)$ is a non-decreasing sequence of functions,
summable on a set $\mathcal{E}$ of finite measure, the integral over $\mathcal{E}$ of the limit
function $f(P)$ is finite or $(+\infty)$, and (57) holds.

The summable functions $f_n(P)$ are finite almost everywhere on $\mathcal{E}$,
and the non-decreasing sequence $f_n(P)$ has a limit at every point,
which may be equal to $(+\infty)$. Let us consider the non-decreasing
sequence of functions $f_n(P) - f_1(P)$. We obviously have

$$ 0 \le f_n(P) - f_1(P) \le f(P) - f_1(P). $$

If the non-negative function $f(P) - f_1(P)$ is summable over $\mathcal{E}$,
$f(P)$ is also summable. The difference $f(P) - f_1(P)$ can play the role
of $F(P)$ in Theorem 1, and application of this theorem gives

$$ \lim_{n \to \infty} \int_{\mathcal{E}} [f_n(P) - f_1(P)]\,G(d\mathcal{E}) = \int_{\mathcal{E}} [f(P) - f_1(P)]\,G(d\mathcal{E}). $$

Addition of the integral of $f_1(P)$ to both sides gives (57). Now let
the integral of $f(P) - f_1(P)$ be $(+\infty)$. Since $f_1(P)$ is summable,
the integral of $f(P)$ is also $(+\infty)$. Notice further that, if the sequence
$\varphi_n(P)$ tends to $\varphi(P)$ almost everywhere, $[\varphi_n(P)]_N \to [\varphi(P)]_N$ almost
everywhere for any $N$.

This can be proved simply by remarking that, if $\varphi_n(P) \to \varphi(P)$ at
some point, $[\varphi_n(P)]_N \to [\varphi(P)]_N$ at this point. This is easily seen
if we consider separately the cases $\varphi(P) < N$ and $\varphi(P) \ge N$. Thus
the non-decreasing sequence of non-negative functions
$[f_n(P) - f_1(P)]_N$ tends almost everywhere to $[f(P) - f_1(P)]_N$. The limit
function is bounded and all the more summable. By what has been
proved:

$$ \lim_{n \to \infty} \int_{\mathcal{E}} [f_n(P) - f_1(P)]_N\,G(d\mathcal{E}) = \int_{\mathcal{E}} [f(P) - f_1(P)]_N\,G(d\mathcal{E}). \tag{62} $$

Let $K$ be any given positive number. Since the integral of
$[f(P) - f_1(P)]$ is equal to $(+\infty)$, we can fix an $N$ such that the integral
on the right-hand side of (62) is greater than $K$. Hence we have,
by (62), for all sufficiently large $n$:

$$ \int_{\mathcal{E}} [f_n(P) - f_1(P)]_N\,G(d\mathcal{E}) > K, $$

and all the more:

$$ \int_{\mathcal{E}} [f_n(P) - f_1(P)]\,G(d\mathcal{E}) > K. $$

Since $K$ is arbitrary, it follows that

$$ \lim_{n \to \infty} \left[ \int_{\mathcal{E}} f_n(P)\,G(d\mathcal{E}) - \int_{\mathcal{E}} f_1(P)\,G(d\mathcal{E}) \right] = +\infty, $$

i.e.

$$ \lim_{n \to \infty} \int_{\mathcal{E}} f_n(P)\,G(d\mathcal{E}) = +\infty, $$

and (57) is proved for the case when the integral of $f(P)$ is $(+\infty)$.

Note. A similar theorem holds for a decreasing sequence of
summable functions, except that the integral of the limit function
may be $(-\infty)$ instead of $(+\infty)$. If $f_n(P)$ is a decreasing sequence,
we obtain an increasing sequence on putting $\varphi_n = -f_n$, and the
minus sign can be taken outside the integral.
The following corollary of the above theorem is important for
what follows.

THEOREM 3. If the functions $u_k(P)$ $(k = 1, 2, 3, \ldots)$ are non-negative
and summable on $\mathcal{E}$, and the series with non-negative terms:

$$ \sum_{k=1}^{\infty} \int_{\mathcal{E}} u_k(P)\,G(d\mathcal{E}) \tag{63} $$

is convergent, the series

$$ \sum_{k=1}^{\infty} u_k(P) \tag{64} $$

is convergent almost everywhere on $\mathcal{E}$, and $u_k(P) \to 0$ almost everywhere
on $\mathcal{E}$.

We consider the non-decreasing sequence of non-negative functions,
summable on $\mathcal{E}$:

$$ f_n(P) = \sum_{k=1}^{n} u_k(P) $$

and apply the previous theorem to this sequence. Since series (63)
is convergent, the integrals of the $f_n(P)$ have a finite limit as $n$ increases
indefinitely. Thus the limit function, here expressed by series (64):

$$ f(P) = \sum_{k=1}^{\infty} u_k(P) $$

is summable on $\mathcal{E}$, and hence has finite values almost everywhere
on $\mathcal{E}$, i.e. series (64) is in fact convergent almost everywhere on $\mathcal{E}$.
But the terms of a convergent series tend to zero on moving away
indefinitely from the initial term, i.e. $u_k(P) \to 0$ almost everywhere
on $\mathcal{E}$, and the theorem is fully proved.

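THEOREM 4. If $f_n(P)$ is a sequence of non-negative functions, summable
on $\mathcal{E}$, that tends almost everywhere on $\mathcal{E}$ to a limit function $f(P)$, and
the integrals of the $f_n(P)$ do not exceed some number $A$ for any $n$, i.e.

$$ \int_{\mathcal{E}} f_n(P)\,G(d\mathcal{E}) \le A, $$

then $f(P)$ is summable on $\mathcal{E}$, and we have

$$ \int_{\mathcal{E}} f(P)\,G(d\mathcal{E}) \le A. \tag{65} $$

We have the inequality:

$$ \int_{\mathcal{E}} [f_n]_N\,G(d\mathcal{E}) \le \int_{\mathcal{E}} f_n\,G(d\mathcal{E}) \le A \tag{66} $$

and, by property 15 of [49], with the number $N$ playing the role of $L$,
we can write

$$ \lim_{n \to \infty} \int_{\mathcal{E}} [f_n]_N\,G(d\mathcal{E}) = \int_{\mathcal{E}} [f]_N\,G(d\mathcal{E}). $$

On passing to the limit as $n \to \infty$ in inequality (66), we get

$$ \int_{\mathcal{E}} [f]_N\,G(d\mathcal{E}) \le A, $$

whence it follows that $f(P)$ is summable, and (65) is obtained as
$N \to \infty$.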
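
The inequality in (65) cannot in general be replaced by an equality. A simple illustration follows, given as a sketch under our own assumptions: $\mathcal{E} = [0, 1]$ with Lebesgue measure, and a sequence chosen by us for the purpose.

```latex
f_n(x) = n x^{n-1} \ge 0,
\qquad
\int_0^1 f_n(x)\,dx = 1 = A,
\qquad
f_n(x) \to f(x) = 0 \ \text{ for } 0 \le x < 1,
\qquad
\int_0^1 f(x)\,dx = 0 < A .
```

The single point $x = 1$, where $f_n(1) = n$ has no finite limit, has measure zero and does not affect the convergence almost everywhere.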




55. The class $L_2$. We shall consider in the present section a class
of measurable functions. This class plays an important part in
applications of our present theory to various problems of mathematics
and mathematical physics.

DEFINITION. A real function $f(P)$, measurable on a measurable set $\mathcal{E}$
of finite measure, is said to be square summable on $\mathcal{E}$ if its square $f^2(P)$
is summable on $\mathcal{E}$, i.e. if

$$ \int_{\mathcal{E}} f^2(P)\,G(d\mathcal{E}) < +\infty. $$


The class of functions square summable on $\mathcal{E}$ is written symbolically
as $L_2^G$. The symbol $L_2$ is used for the case of the Lebesgue integral,
i.e. when $G(\Delta)$ is the area of the interval $\Delta$. We shall in future simply
write $L_2$ instead of $L_2^G$ for the sake of simplicity. But it must be borne
in mind that everything said below holds for any choice of $G(\Delta)$.
We now prove a number of properties of the class $L_2$.

THEOREM 1. If $f(P)$ and $g(P) \in L_2$, $f(P)$ and $f(P)\,g(P)$ are summable
on $\mathcal{E}$.

These assertions follow at once from the inequalities

$$ |f| \le \tfrac{1}{2}(f^2 + 1), \qquad |fg| \le \tfrac{1}{2}(f^2 + g^2) $$

and properties 1 and 10 of [52]. It must be mentioned that real
functions are understood here and throughout what follows.

THEOREM 2. If $f(P)$ and $g(P) \in L_2$, $cf(P)$ and $f(P) + g(P)$ also
belong to $L_2$.

The statement regarding $cf(P)$ is obvious, whilst it follows for
$f(P) + g(P)$ from the formula

$$ (f + g)^2 = f^2 + 2fg + g^2 $$

together with Theorem 1 and property 1 of [52].

THEOREM 3. If $f$ and $g \in L_2$, the (Buniakowski-Schwarz) inequality
holds:

$$ \left[ \int_{\mathcal{E}} fg\,G(d\mathcal{E}) \right]^2 \le \int_{\mathcal{E}} f^2\,G(d\mathcal{E}) \int_{\mathcal{E}} g^2\,G(d\mathcal{E}). \tag{67} $$

The proof is precisely the same as for the Riemann integral. We shall
repeat it. It must be remarked first of all that, if the coefficients
are real and $a > 0$ in the quadratic form $au^2 + 2bu + c$, the identity

$$ au^2 + 2bu + c = \frac{1}{a}\left[ (au + b)^2 + (ac - b^2) \right] $$

implies that $b^2 \le ac$ if our quadratic form has non-negative values
for all real $u$. We assume that $f$ and $g$ are not equivalent to zero,
since otherwise (67) is trivial due to its left-hand side being zero.
We write the obvious equation

$$ \int_{\mathcal{E}} (fu + g)^2\,G(d\mathcal{E}) = u^2 \int_{\mathcal{E}} f^2\,G(d\mathcal{E}) + 2u \int_{\mathcal{E}} fg\,G(d\mathcal{E}) + \int_{\mathcal{E}} g^2\,G(d\mathcal{E}), $$

where $u$ is a parameter. The integral on the left has a non-negative
value for any real $u$. Hence the quadratic form on the right also has
this property. We must have $b^2 \le ac$ for this quadratic form, which
leads to inequality (67). Notice that the coefficient $a$ in the quadratic
form is certainly positive, since $f$ is not equivalent to zero.
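COROLLARY. If $f \in L_2$, obviously $|f| \in L_2$, and by writing $|f|$ as
$|f| = |f| \cdot 1$, we get the inequality:

$$ \left| \int_{\mathcal{E}} f\,G(d\mathcal{E}) \right| \le \int_{\mathcal{E}} |f|\,G(d\mathcal{E}) \le \sqrt{ \int_{\mathcal{E}} f^2\,G(d\mathcal{E}) \cdot G(\mathcal{E}) }. \tag{68} $$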


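THEOREM 4. If $f$ and $g \in L_2$, we have

$$ \sqrt{ \int_{\mathcal{E}} (f + g)^2\,G(d\mathcal{E}) } \le \sqrt{ \int_{\mathcal{E}} f^2\,G(d\mathcal{E}) } + \sqrt{ \int_{\mathcal{E}} g^2\,G(d\mathcal{E}) }. \tag{69} $$

We obtain from (67):

$$ \int_{\mathcal{E}} fg\,G(d\mathcal{E}) \le \sqrt{ \int_{\mathcal{E}} f^2\,G(d\mathcal{E}) } \sqrt{ \int_{\mathcal{E}} g^2\,G(d\mathcal{E}) }. $$

We multiply both sides of this inequality by two and add the integral
of $f^2$ and the integral of $g^2$ to both sides of the inequality obtained.
The resulting inequality:

$$ \int_{\mathcal{E}} f^2\,G(d\mathcal{E}) + 2 \int_{\mathcal{E}} fg\,G(d\mathcal{E}) + \int_{\mathcal{E}} g^2\,G(d\mathcal{E}) \le \int_{\mathcal{E}} f^2\,G(d\mathcal{E}) + 2 \sqrt{ \int_{\mathcal{E}} f^2\,G(d\mathcal{E}) } \sqrt{ \int_{\mathcal{E}} g^2\,G(d\mathcal{E}) } + \int_{\mathcal{E}} g^2\,G(d\mathcal{E}) $$

can be written as

$$ \int_{\mathcal{E}} (f + g)^2\,G(d\mathcal{E}) \le \left[ \sqrt{ \int_{\mathcal{E}} f^2\,G(d\mathcal{E}) } + \sqrt{ \int_{\mathcal{E}} g^2\,G(d\mathcal{E}) } \right]^2, $$

which leads us directly to (69).

Notice further that, if $f(P) \in L_2$, $f(P)$ takes finite values almost
everywhere on $\mathcal{E}$, due to the summability of $f^2(P)$.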
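
In the special case where $G$ reduces to unit masses at finitely many points, the integral over $\mathcal{E}$ is a finite sum, and inequalities (67) and (69) can be checked numerically. The following is a minimal sketch under that assumption; the helper names `integral` and `norm` and the sample values of `f` and `g` are ours, chosen only for illustration.

```python
import math


def integral(values):
    # Assumption: unit mass at each point, so the integral is a plain sum.
    return sum(values)


def norm(f):
    # Square root of the integral of f^2, as on both sides of (69).
    return math.sqrt(integral([x * x for x in f]))


# Values of two real functions at four points (illustrative data).
f = [1.0, -2.0, 3.0, 0.5]
g = [4.0, 0.0, -1.0, 2.0]

# Inequality (67): (integral of f*g)^2 <= (integral of f^2)(integral of g^2).
fg = integral([a * b for a, b in zip(f, g)])
lhs_67 = fg ** 2
rhs_67 = integral([a * a for a in f]) * integral([b * b for b in g])

# Inequality (69): norm of f + g does not exceed norm f + norm g.
lhs_69 = norm([a + b for a, b in zip(f, g)])
rhs_69 = norm(f) + norm(g)

print(lhs_67 <= rhs_67, lhs_69 <= rhs_69)
```

Such a check only illustrates the inequalities for particular data; the proofs above cover any choice of $G$.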




56. Convergence in the mean. We now introduce a new type of
convergence in class $L_2$.

DEFINITION. We say that a sequence of functions $f_n(P)$ of $L_2$ is
convergent in the mean to a function $f(P)$ of $L_2$, or simply that it is
convergent in $L_2$ to $f(P)$, if

$$ \lim_{n \to \infty} \int_{\mathcal{E}} [f(P) - f_n(P)]^2\,G(d\mathcal{E}) = 0. \tag{70} $$

It must be noticed first of all that, if we replace $f(P)$ by an equivalent
function $g(P)$, the integral in (70) is unchanged, and $g(P)$ is
also a limit in the mean of the $f_n(P)$. In future, we shall regard
equivalent functions of $L_2$ as the same function. We now prove that the
limit is unique, i.e. that the following theorem holds.

THEOREM 5. If a sequence $f_n(P)$ of $L_2$ is convergent in $L_2$ to two functions
$f(P)$ and $g(P)$, these functions are equivalent.

We write the obvious equation

$$ f - g = (f - f_n) + (f_n - g) $$

and apply inequality (69) to the right-hand side:

$$ \sqrt{ \int_{\mathcal{E}} (f - g)^2\,G(d\mathcal{E}) } \le \sqrt{ \int_{\mathcal{E}} (f - f_n)^2\,G(d\mathcal{E}) } + \sqrt{ \int_{\mathcal{E}} (g - f_n)^2\,G(d\mathcal{E}) }. $$

As $n \to \infty$, the right-hand side tends to zero, and the left is
independent of $n$, i.e. we have

$$ \int_{\mathcal{E}} (f - g)^2\,G(d\mathcal{E}) = 0. $$

It follows from this, by property 8 of [52], that $f - g$ is equivalent
to zero, so that $f$ and $g$ are equivalent. This theorem establishes the
uniqueness of the limit of the $f_n(P)$ in $L_2$, though obviously not
every sequence of functions has a limit in the mean. It should be
noticed that convergence almost everywhere does not imply convergence
in the mean, and convergence in the mean does not imply
convergence almost everywhere. We prove the following theorem in
connection with this.

THEOREM 6. If a sequence $f_n(P)$ of $L_2$ tends to $f(P)$ in the mean on $\mathcal{E}$,
we can extract a subsequence $f_{n_k}(P)$, which is convergent almost everywhere
to $f(P)$ on $\mathcal{E}$.

It follows from (70), by property 10 of [51], that $f_n(P) \to f(P)$
in measure on $\mathcal{E}$, and Theorem 6 is a consequence of Theorem 7
of [44].
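
The two negative statements made before Theorem 6 can be illustrated by standard examples. They are given here as a sketch and are not taken from the text: we assume $\mathcal{E} = [0, 1]$ with Lebesgue measure, and the sequences $f_n$ and $g_n$ are chosen by us.

```latex
% (a) convergence at every point without convergence in the mean to zero
f_n(x) = \begin{cases} \sqrt{n}, & 0 < x < 1/n, \\ 0, & \text{elsewhere on } [0,1], \end{cases}
\qquad
f_n(x) \to 0 \ \text{ for every } x,
\qquad
\int_0^1 f_n^2(x)\,dx = 1 .

% (b) convergence in the mean to zero without convergence at any point
g_n(x) = \begin{cases} 1, & j/2^k \le x \le (j+1)/2^k, \\ 0, & \text{elsewhere on } [0,1], \end{cases}
\qquad
n = 2^k + j, \quad 0 \le j < 2^k,
\qquad
\int_0^1 g_n^2(x)\,dx = 2^{-k} \to 0 .
```

In (b) every point $x$ of $[0, 1]$ lies in infinitely many of the intervals and outside infinitely many of them, so that $g_n(x)$ has no limit. The subsequence $g_{2^k}$ (the case $j = 0$) nevertheless tends to zero for every $x > 0$, in agreement with Theorem 6.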

COROLLARY. If $f_n(P)$ tend to $f(P)$ in the mean and tend to $\varphi(P)$ almost
everywhere on $\mathcal{E}$, then $\varphi(P)$ and $f(P)$ are equivalent on $\mathcal{E}$. If $f_n(P) \to \varphi(P)$
almost everywhere, then all the more $f_{n_k}(P) \to \varphi(P)$ almost everywhere.
But, as we have seen, $f_{n_k}(P) \to f(P)$ almost everywhere, whence it follows
that $\varphi(P)$ and $f(P)$ are equivalent.

A necessary and sufficient condition can be established for con- 
vergence in the mean, analogous to the Cauchy condition for the 
existence of a limit of a numerical sequence [I; 36]. As a preliminary, 
we introduce a new definition. 

DEFINITION. We say that a sequence $f_n(P)$ of functions of $L_2$ is
mutually convergent in the mean if, given any positive $\varepsilon$, there exists
an $N$ such that

$$ \int_{\mathcal{E}} (f_n - f_m)^2\,G(d\mathcal{E}) \le \varepsilon \quad \text{for } n \text{ and } m > N. \tag{71} $$

THEOREM 7. The necessary condition for a sequence $f_n(P)$ to be convergent
in the mean to some function of $L_2$ is that it be mutually convergent
in the mean.


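We are given that the sequence is convergent in the mean to a
function $f(P)$. We write $f_n(P) - f_m(P)$ as

$$ f_n - f_m = (f_n - f) + (f - f_m) $$

and apply inequality (69):

$$ \sqrt{ \int_{\mathcal{E}} (f_n - f_m)^2\,G(d\mathcal{E}) } \le \sqrt{ \int_{\mathcal{E}} (f_n - f)^2\,G(d\mathcal{E}) } + \sqrt{ \int_{\mathcal{E}} (f - f_m)^2\,G(d\mathcal{E}) }. $$

Let $\varepsilon$ be a given positive number. In view of the convergence in
the mean to $f(P)$, there exists an $N$ such that, for $n$ and $m > N$,
the integrals under the square root signs on the right-hand side
are $\le \varepsilon/4$. Each square root is then at most $\sqrt{\varepsilon}/2$, so that the
left-hand side is at most $\sqrt{\varepsilon}$; squaring leads at once to (71), and the
theorem is proved. We now prove the converse.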

THEOREM 8. A sufficient condition for a sequence $f_n(P)$ to be convergent
in the mean to some function is that it be mutually convergent
in the mean.

Given that $f_n(P)$ is mutually convergent, we have to show that it
is convergent in the mean to some function. Since it is mutually
convergent, there exists an increasing sequence of subscripts
$n_1 < n_2 < n_3 < \ldots$ such that


$$ \int_{\mathcal{E}} [f_{n_{k+1}}(P) - f_{n_k}(P)]^2\,G(d\mathcal{E}) \le \frac{1}{2^{2k}}. $$




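On applying inequality (67) with $f = |f_{n_{k+1}} - f_{n_k}|$ and $g = 1$, we
get

$$ \int_{\mathcal{E}} |f_{n_{k+1}}(P) - f_{n_k}(P)|\,G(d\mathcal{E}) \le \sqrt{ \int_{\mathcal{E}} [f_{n_{k+1}}(P) - f_{n_k}(P)]^2\,G(d\mathcal{E}) } \sqrt{ \int_{\mathcal{E}} G(d\mathcal{E}) }, $$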





or, by the previous inequality:

$$ \int_{\mathcal{E}} |f_{n_{k+1}}(P) - f_{n_k}(P)|\,G(d\mathcal{E}) \le \frac{1}{2^{k}} \sqrt{G(\mathcal{E})}. $$
Hence follows the convergence of the series

$$ \sum_{k=1}^{\infty} \int_{\mathcal{E}} |f_{n_{k+1}}(P) - f_{n_k}(P)|\,G(d\mathcal{E}), $$

and, by Theorem 3 of [54], the series

$$ \sum_{k=1}^{\infty} |f_{n_{k+1}}(P) - f_{n_k}(P)| $$

is convergent almost everywhere on $\mathcal{E}$. All the more, the series

$$ f_{n_1}(P) + \sum_{k=1}^{\infty} [f_{n_{k+1}}(P) - f_{n_k}(P)] $$

is convergent almost everywhere, the sum of the first $p$ terms being
equal to $f_{n_p}(P)$, i.e. the sequence

$$ f_{n_1}(P), \ f_{n_2}(P), \ f_{n_3}(P), \ \ldots $$

is convergent almost everywhere on $\mathcal{E}$ to a function $f(P)$ with finite
values. Let us show that $f(P) \in L_2$ and that $f_n(P)$ is convergent in
the mean to $f(P)$. Since the sequence $f_n(P)$ is mutually convergent,
given any positive $\varepsilon$, there exists an $N$ such that

$$ \int_{\mathcal{E}} [f_{n_k}(P) - f_n(P)]^2\,G(d\mathcal{E}) \le \varepsilon \quad \text{for } n_k \text{ and } n > N. $$

We let $n_k$ tend to infinity in this inequality and use Theorem 4 of
[54], which gives us

$$ \int_{\mathcal{E}} [f(P) - f_n(P)]^2\,G(d\mathcal{E}) \le \varepsilon \quad \text{for } n > N. \tag{72} $$


Hence it follows that $f(P) - f_n(P) \in L_2$. But $f_n(P)$ also belongs to
$L_2$. On adding $f(P) - f_n(P)$ and $f_n(P)$, we find, by Theorem 2, that
$f(P) \in L_2$. Inequality (72) shows finally that $f_n(P)$ tends to $f(P)$ in
the mean. The last two theorems lead to the following: the necessary
and sufficient condition for the sequence $f_n(P)$ to be mutually convergent
in the mean is that the sequence be convergent in the mean
to some function.

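THEOREM 9. If $f_n(P)$ and $g_n(P) \in L_2$ and $f_n(P) \to f(P)$, $g_n(P) \to g(P)$
in the mean, then

$$ \int_{\mathcal{E}} f_n(P)\,g_n(P)\,G(d\mathcal{E}) \to \int_{\mathcal{E}} f(P)\,g(P)\,G(d\mathcal{E}). $$

On using the notation for any two functions $\varphi(P)$ and $\psi(P)$ of $L_2$
[IV; 35]:

$$ (\varphi, \psi) = \int_{\mathcal{E}} \varphi(P)\,\psi(P)\,G(d\mathcal{E}), $$

we can write the Buniakowski inequality as

$$ (\varphi, \psi)^2 \le (\varphi, \varphi)\,(\psi, \psi). $$

We now put

$$ f_n(P) - f(P) = \varphi_n(P); \qquad g_n(P) - g(P) = \psi_n(P). $$

By hypothesis, $(\varphi_n, \varphi_n)$ and $(\psi_n, \psi_n) \to 0$. We form the difference

$$ (f, g) - (f_n, g_n) = (f, g) - (f + \varphi_n, g + \psi_n) = -(f, \psi_n) - (\varphi_n, g) - (\varphi_n, \psi_n), $$

whence

$$ |(f, g) - (f_n, g_n)| \le |(f, \psi_n)| + |(\varphi_n, g)| + |(\varphi_n, \psi_n)| \le \sqrt{(f, f)} \sqrt{(\psi_n, \psi_n)} + \sqrt{(\varphi_n, \varphi_n)} \sqrt{(g, g)} + \sqrt{(\varphi_n, \varphi_n)} \sqrt{(\psi_n, \psi_n)}. $$

The right-hand side tends to zero as $n \to \infty$, whence
$|(f, g) - (f_n, g_n)| \to 0$, i.e. $(f_n, g_n) \to (f, g)$, which is what we set out to prove.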


57. Hilbert function space. Like the family C of [14], the family
of functions of $L_2$ forms a function space. An element of this space
is a real function, square summable on $\mathcal{E}$. Equivalent functions are
identified here, i.e. they correspond to the same element of $L_2$.

Addition of elements and multiplication by a real number can be
defined, the operations being subject to the ordinary laws of algebra.
The norm of an element $f$ (the length of a vector) is defined as the
non-negative number given by

$$ \| f(P) \| = \sqrt{ \int_{\mathcal{E}} f^2(P)\,G(d\mathcal{E}) }. \tag{73} $$




We say that a sequence of elements $f_n(P)$ of $L_2$ is convergent to
an element $f(P)$ of $L_2$ if $\| f(P) - f_n(P) \| \to 0$ as $n \to \infty$. By (73),
this convergence in norm is equivalent to convergence in the mean.

The scalar product of two elements $f(P)$ and $g(P)$ can also be defined.
It is given by

$$ (f, g) = \int_{\mathcal{E}} fg\,G(d\mathcal{E}), \tag{74} $$

and we obviously have

$$ \| f \| = \sqrt{(f, f)}. \tag{75} $$

The distance between two elements $f$ and $g$ is given by

$$ \rho(f, g) = \| f - g \| = \sqrt{ \int_{\mathcal{E}} (f - g)^2\,G(d\mathcal{E}) } = \sqrt{(f - g, f - g)}. \tag{76} $$

Given three elements $f$, $g$ and $h$, we can write $f - h = (f - g) + (g - h)$
and apply (69). We thus obtain, using definition (76), the so-called
triangle rule:

$$ \rho(f, h) \le \rho(f, g) + \rho(g, h). \tag{77} $$

The zero element of the space is defined as the function
identically equal to zero on $\mathcal{E}$, or, what amounts to the same thing,
the function equivalent to zero. The norm of the zero element is
zero, whilst the norm of any other element is positive, by property 8
of [51]. The distance $\rho(f, g) \ge 0$, and the equals sign only holds
when the elements are the same, i.e. $f$ and $g$ are equivalent functions.
The distance and scalar product are symmetrical, i.e. $\rho(g, f) = \rho(f, g)$
and $(g, f) = (f, g)$. The necessary and sufficient condition for a
sequence of elements to have a limit in our functional space is that
it be mutually convergent, i.e. given any $\varepsilon$, there exists an $N$ such
that $\| f_m - f_n \| \le \varepsilon$ for $n$ and $m > N$. This last property is usually
described by saying that the space $L_2$ is complete.

Inequality (77) obviously holds for any finite number of terms:

$$ \rho(f_1, f_m) \le \rho(f_1, f_2) + \rho(f_2, f_3) + \ldots + \rho(f_{m-1}, f_m) $$

or

$$ \| f_1 - f_m \| \le \| f_1 - f_2 \| + \| f_2 - f_3 \| + \ldots + \| f_{m-1} - f_m \|. \tag{$77_1$} $$


An essentially similar space $L_2$ can be formed for the complex
functions (54). The function $f(P) = f_1(P) + i f_2(P)$ is said to belong
to $L_2$ if $f_1(P)$ and $f_2(P)$ belong to $L_2$. The square of the modulus $|f(P)|^2$
is a summable function. Theorems 1 and 2 are retained. Inequalities
(67) and (69) are rewritten as

$$ \begin{aligned} \left| \int_{\mathcal{E}} fg\,G(d\mathcal{E}) \right|^2 &\le \int_{\mathcal{E}} |f|^2\,G(d\mathcal{E}) \int_{\mathcal{E}} |g|^2\,G(d\mathcal{E}), \\ \sqrt{ \int_{\mathcal{E}} |f + g|^2\,G(d\mathcal{E}) } &\le \sqrt{ \int_{\mathcal{E}} |f|^2\,G(d\mathcal{E}) } + \sqrt{ \int_{\mathcal{E}} |g|^2\,G(d\mathcal{E}) }. \end{aligned} \tag{78} $$

In the definitions of convergence in the mean and mutual convergence,
$(f - f_n)^2$ and $(f_n - f_m)^2$ have to be replaced by $|f - f_n|^2$
and $|f_n - f_m|^2$. Theorems 7 and 8 are retained. Multiplication by a
complex as well as a real number is permissible when forming the
functional space. The norm of an element is given by

$$ \| f \| = \sqrt{ \int_{\mathcal{E}} |f|^2\,G(d\mathcal{E}) }, \tag{79} $$

and the scalar product by

$$ (f, g) = \int_{\mathcal{E}} f \bar{g}\,G(d\mathcal{E}), \tag{80} $$

where $\bar{a}$ denotes as usual the complex conjugate of $a$. Formula (77)
holds as before. The distance between two elements is given by (76)
with $(f - g)^2$ replaced by $|f - g|^2$, and it has the same properties
as in a real space. We have for the scalar product: $(g, f) = \overline{(f, g)}$.
Everything said about the complex function space follows at once
from the fact that functions $f_1(P)$ and $f_2(P)$ belong to the real space $L_2$.
The functional space $L_2$ is often called a Hilbert functional space.

A particular case must be mentioned. Suppose that the function
$G(\mathcal{E})$ corresponds to concentrated masses located at the points
$P_1, P_2, \ldots, P_m$ and equal to unity. In this case the Lebesgue-Stieltjes
integral of any function $f(P)$ taking finite values at the above
points, over any set $\mathcal{E}$ containing the points, degenerates to a finite
sum:

$$ \int_{\mathcal{E}} f(P)\,G(d\mathcal{E}) = \sum_{k=1}^{m} f(P_k). $$

If we regard the values of any $f(P)$ at the $P_k$ $(k = 1, \ldots, m)$ as
the components of an $m$-dimensional complex vector, we obtain an
$m$-dimensional space $R_m$, the theory of which was discussed in volume
III [III$_1$; 25]. The definitions just given of addition, multiplication by a
number, norm, scalar product etc. are the same as those discussed
earlier.




58. Orthogonal systems of functions. The theory of orthogonal
systems of functions is directly connected with the functional space
$L_2$. This theory has already been described [IV; 38, 80]. We shall
supplement the previous treatment by bringing in Lebesgue-Stieltjes
or Lebesgue integrals. We shall start by discussing real
functions.

DEFINITION. We say that the functions

$$ \varphi_1(P), \ \varphi_2(P), \ \ldots \tag{81} $$

given on a measurable set $\mathcal{E}$ of finite measure and belonging to $L_2$, form
an orthogonal and normed [or orthonormal] system if we have

$$ \int_{\mathcal{E}} \varphi_k(P)\,\varphi_l(P)\,G(d\mathcal{E}) = \begin{cases} 0 & \text{for } k \ne l, \\ 1 & \text{for } k = l. \end{cases} \tag{82} $$


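Given any function $f(P)$ of $L_2$, we can form its Fourier coefficients
with respect to system (81):

$$ a_n = \int_{\mathcal{E}} f(P)\,\varphi_n(P)\,G(d\mathcal{E}), \tag{83} $$

and its Fourier series:

$$ \sum_{n=1}^{\infty} a_n \varphi_n(P). \tag{84} $$

We cannot say anything about the convergence of this series, but
we can form the segment of the series:

$$ S_n(f) = \sum_{k=1}^{n} a_k \varphi_k(P). \tag{85} $$

The expression

$$ \int_{\mathcal{E}} \left[ f(P) - \sum_{k=1}^{n} b_k \varphi_k(P) \right]^2 G(d\mathcal{E}) \tag{86} $$

has a least value if the coefficients $b_k$ are taken equal to the Fourier
coefficients $a_k$. In this case we get the following simple formula for
expression (86):

$$ \int_{\mathcal{E}} [f(P) - S_n(f)]^2\,G(d\mathcal{E}) = \int_{\mathcal{E}} f^2(P)\,G(d\mathcal{E}) - \sum_{k=1}^{n} a_k^2, \tag{87} $$

from which Bessel's inequality follows:

$$ \sum_{k=1}^{\infty} a_k^2 \le \int_{\mathcal{E}} f^2(P)\,G(d\mathcal{E}) \tag{88} $$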




and the convergence of the series on the left-hand side of this inequality.
If the = sign holds in (88), the resulting formula:

$$ \int_{\mathcal{E}} f^2(P)\,G(d\mathcal{E}) = \sum_{k=1}^{\infty} a_k^2 \tag{89} $$

is called the closure equation. By (87), the closure equation is equivalent
to the fact that the segments of the Fourier series $S_n(f)$ tend
in the mean to $f(P)$. We now prove the following fundamental theorem.

THEOREM 1 (Riesz-Fischer). If $c_n$ is any given sequence of real
numbers, the squares of which form a convergent series:

$$ \sum_{n=1}^{\infty} c_n^2 < +\infty, \tag{90} $$

there exists a unique function of $L_2$ for which the $c_n$ are the Fourier
coefficients with respect to system (81) and for which the closure equation
(89) holds.
We form the functions

$$ S_n(P) = \sum_{k=1}^{n} c_k \varphi_k(P). \tag{91} $$

Since system (81) is orthonormal, we have

$$ \int_{\mathcal{E}} [S_n(P) - S_p(P)]^2\,G(d\mathcal{E}) = c_{p+1}^2 + c_{p+2}^2 + \ldots + c_n^2 \quad (n > p) $$

and the convergence of series (90) implies that the right-hand side
of the last formula tends to zero on indefinite increase of $p$, i.e. the
sequence of functions (91) of $L_2$ is mutually convergent. Thus a
function $f(P)$ of $L_2$ exists, to which $S_n(P)$ is convergent in the mean:

$$ \lim_{n \to \infty} \int_{\mathcal{E}} [f(P) - S_n(P)]^2\,G(d\mathcal{E}) = 0. \tag{92} $$

Let us show that the $c_k$ are the Fourier coefficients $a_k$ of this function.
On taking (83) into account, as also that system (81) is orthonormal,
we can write

$$ \int_{\mathcal{E}} [f(P) - S_n(P)]^2\,G(d\mathcal{E}) = \left[ \int_{\mathcal{E}} f^2(P)\,G(d\mathcal{E}) - \sum_{k=1}^{n} a_k^2 \right] + \sum_{k=1}^{n} (c_k - a_k)^2. \tag{93} $$

The difference in square brackets on the right-hand side is non-negative
by Bessel's inequality. The remaining terms on the right-hand
side are also non-negative. As $n \to \infty$, the left-hand side tends
to zero, so that the same can be said of the right-hand side. It follows
from this that each of the non-negative terms $(c_k - a_k)^2$ is equal to
zero, i.e. $c_k = a_k$, which is what we set out to prove. Thus functions
(91) are segments of the Fourier series of $f(P)$ and it follows from
(92) that the closure equation holds for $f(P)$. It remains to show that
the $f(P)$ with the properties mentioned is unique. If a function $g(P)$
exists with these properties, (91) are segments of the Fourier series
both for $f(P)$ and for $g(P)$, i.e. the sequence $S_n(P)$ tends in the mean
both to $f(P)$ and to $g(P)$. Since the limit in $L_2$ is unique, this means
that $f(P)$ and $g(P)$ are equivalent, i.e. they represent the same element
of $L_2$, and the theorem is fully proved. We now define a closed system.

DEFINITION. An orthonormal system (81) is said to be closed if the
closure equation (89) holds for any function $f(P)$ of $L_2$.
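
Formulas (83), (87), (88) and (89) can be checked numerically when $G$ consists of unit masses at three points, so that functions become vectors with three components and integrals become sums. The following is a minimal sketch under that assumption; the orthonormal system `phi`, the function `f` and the helper names are ours, chosen only for illustration.

```python
import math


def dot(u, v):
    # Assumption: unit point masses, so the integral of u*v is a finite sum.
    return sum(x * y for x, y in zip(u, v))


r = 1.0 / math.sqrt(2.0)
# An orthonormal system of three real functions on three points.
phi = [
    [r, r, 0.0],
    [r, -r, 0.0],
    [0.0, 0.0, 1.0],
]
f = [3.0, 1.0, 2.0]

# Fourier coefficients, formula (83).
a = [dot(f, p) for p in phi]


def partial_sum(n):
    # Segment of the Fourier series with n terms, formula (85).
    return [sum(a[k] * phi[k][j] for k in range(n)) for j in range(3)]


def sq_error(n):
    # Left-hand side of (87): integral of [f - S_n(f)]^2.
    s_n = partial_sum(n)
    return sum((x - y) ** 2 for x, y in zip(f, s_n))


norm_sq = dot(f, f)
for n in range(4):
    right_side = norm_sq - sum(c * c for c in a[:n])  # right-hand side of (87)
    print(n, round(sq_error(n), 12), round(right_side, 12))
```

With all three functions of the system the error vanishes and the closure equation (89) holds, which is the finite-dimensional counterpart of a closed system.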

We have not assumed in the proof of Theorem 1 that system (81)
is closed. If this is the case, it is not necessary to stipulate that the
closure equation holds for the function, since it holds for any function
of $L_2$ by the definition of a closed system. Thus Theorem 1 is stated
as follows for a closed system.

THEOREM 1. If system (81) is closed and $c_n$ is any given sequence
of real numbers, for which series (90) is convergent, there exists a unique
function of $L_2$ for which the $c_n$ are the Fourier coefficients.

In addition to a closed system, we can define a complete system. 

DEFINITION. System (81) is said to be complete, if there exists no
function in $L_2$ not zero (i.e. not equivalent to zero) and orthogonal to
all the $\varphi_k(P)$.

We now show that the concepts of closure and completeness are 
equivalent. 

THEOREM 2. The necessary and sufficient condition for system (81)
to be complete is that it be closed.

We use reductio ad absurdum for the necessity. Let system (81)
be complete and non-closed, i.e. there exists a function $h(P)$ of $L_2$
with Fourier coefficients $a_k$ such that

$$ \int_{\mathcal{E}} h^2(P)\,G(d\mathcal{E}) > \sum_{k=1}^{\infty} a_k^2. \tag{94} $$

On the other hand, by Theorem 1, an $f(P)$ of $L_2$ exists, with the
same Fourier coefficients $a_k$, for which the closure equation (89) holds.
On comparing this formula with (94), we get

$$ \int_{\mathcal{E}} h^2(P)\,G(d\mathcal{E}) > \int_{\mathcal{E}} f^2(P)\,G(d\mathcal{E}). \tag{95} $$

But $f(P) - h(P)$ has Fourier coefficients which are all zero, i.e.
it is orthogonal to all the $\varphi_k(P)$; since the system of these is complete,
$f(P) - h(P)$ is equivalent to zero, which contradicts (95), i.e. the
necessity is proved. Let us prove the sufficiency. Given that the
system is closed, we have to show that it is complete, i.e. that if all
the Fourier coefficients of a function $f(P)$ are zero, $f(P)$ is equivalent
to zero. Since the system is closed, we can write (89) for $f(P)$, which
gives us, since all the Fourier coefficients of $f(P)$ are zero:

$$ \int_{\mathcal{E}} f^2(P)\,G(d\mathcal{E}) = 0, $$

whence it follows, by property 8 of [51], that $f(P)$ is equivalent to
zero. Notice that the orthogonalization process that we described in
[IV; 38] can be applied for any system of functions $\psi_k(P)$ of $L_2$.

Everything said above can be extended at once to the case of
complex functions of $L_2$. The fact that system (81) is orthonormal
is now expressed by the equations

$$ \int_{\mathcal{E}} \varphi_k(P)\,\overline{\varphi_l(P)}\,G(d\mathcal{E}) = \begin{cases} 0 & \text{for } k \ne l, \\ 1 & \text{for } k = l, \end{cases} \tag{96} $$

whilst the Fourier coefficients are defined by

$$ a_n = \int_{\mathcal{E}} f(P)\,\overline{\varphi_n(P)}\,G(d\mathcal{E}). \tag{97} $$

In further formulae, we always have to write the square of the
modulus instead of the square of a function or number. For instance,
the closure equation takes the form

$$ \int_{\mathcal{E}} |f(P)|^2\,G(d\mathcal{E}) = \sum_{k=1}^{\infty} |a_k|^2. \tag{98} $$

The above theorems are retained, except that we have to consider
the series of $|c_n|^2$ instead of series (90).

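We must also deduce the so-called generalized closure equation.
Let $a_n$ and $b_n$ be the Fourier coefficients of $f(P)$ and $g(P)$, and let
system (81) be closed. The function $f(P) + g(P)$ has Fourier coefficients
$a_n + b_n$, whilst $f(P) + ig(P)$ has coefficients $a_n + ib_n$. The closure
equations for these are

$$ \int_{\mathcal{E}} |f + g|^2\,G(d\mathcal{E}) = \sum_{n=1}^{\infty} |a_n + b_n|^2, $$

$$ \int_{\mathcal{E}} |f + ig|^2\,G(d\mathcal{E}) = \sum_{n=1}^{\infty} |a_n + ib_n|^2, $$

or

$$ \int_{\mathcal{E}} \left[ |f|^2 + |g|^2 + (f\bar{g} + \bar{f}g) \right] G(d\mathcal{E}) = \sum_{n=1}^{\infty} \left[ |a_n|^2 + |b_n|^2 + (a_n \bar{b}_n + \bar{a}_n b_n) \right]; $$

$$ \int_{\mathcal{E}} \left[ |f|^2 + |g|^2 + i(\bar{f}g - f\bar{g}) \right] G(d\mathcal{E}) = \sum_{n=1}^{\infty} \left[ |a_n|^2 + |b_n|^2 + i(\bar{a}_n b_n - a_n \bar{b}_n) \right]. $$

On taking into account the closure equations for $f$ and $g$, multiplying
the second equation through by $i$ and adding to the first, we get
the generalized closure equation

$$ \int_{\mathcal{E}} f\bar{g}\,G(d\mathcal{E}) = \sum_{n=1}^{\infty} a_n \bar{b}_n. \tag{99} $$

In the case of real functions the generalized equation becomes

$$ \int_{\mathcal{E}} fg\,G(d\mathcal{E}) = \sum_{n=1}^{\infty} a_n b_n. \tag{100} $$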

An immediate consequence of the generalized closure equation is
that the Fourier series of any function $f(P)$ of $L_2$ can be integrated
term by term over the set $\mathcal{E}$ or any measurable part of it $\mathcal{E}'$ [II; 156];
in other words, if $a_k$ $(k = 1, 2, \ldots)$ are the Fourier coefficients of
$f(P)$, then

$$ \int_{\mathcal{E}'} f(P)\,G(d\mathcal{E}) = \sum_{k=1}^{\infty} a_k \int_{\mathcal{E}'} \varphi_k(P)\,G(d\mathcal{E}). \tag{101} $$
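
The passage from (99) to (101) can be written out as follows. This is a sketch only: the characteristic function $\chi(P)$ of $\mathcal{E}'$ and its Fourier coefficients $b_k$ are notation introduced here, not in the text, and system (81) is assumed closed.

```latex
% chi is bounded and measurable on a set of finite measure, hence belongs to L_2
\chi(P) = \begin{cases} 1, & P \in \mathcal{E}', \\ 0, & P \in \mathcal{E} - \mathcal{E}', \end{cases}
\qquad
b_k = \int_{\mathcal{E}} \chi(P)\,\overline{\varphi_k(P)}\,G(d\mathcal{E})
    = \int_{\mathcal{E}'} \overline{\varphi_k(P)}\,G(d\mathcal{E}) .

% apply (99) with g = chi; chi is real, so its conjugate is chi itself
\int_{\mathcal{E}'} f(P)\,G(d\mathcal{E})
  = \int_{\mathcal{E}} f(P)\,\overline{\chi(P)}\,G(d\mathcal{E})
  = \sum_{k=1}^{\infty} a_k \overline{b_k}
  = \sum_{k=1}^{\infty} a_k \int_{\mathcal{E}'} \varphi_k(P)\,G(d\mathcal{E}) .
```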

Let us indicate a further property of space $L_2$, which implies the
existence in $L_2$ of a closed orthonormal system. This property is
usually known as separability and consists in the following: there
exists a denumerable set of elements $\psi_k(P)$ $(k = 1, 2, \ldots)$ of $L_2$
dense in $L_2$, i.e. such that, given any $f(P)$ of $L_2$ and any positive $\varepsilon$,
there exists an element $\psi_m(P)$ of this denumerable set such that
$\| f(P) - \psi_m(P) \| \le \varepsilon$. We shall prove the separability of $L_2$ in a
later section. We now show that separability implies the existence of
a closed orthonormal system. On applying the orthogonalization
process [IV; 38] to $\psi_k(P)$, we obtain some orthonormal system
$\varphi_k(P)$ $(k = 1, 2, \ldots)$. Let us show that it is closed. By what has
been said, given any $f(P)$ of $L_2$ and any positive $\varepsilon$, there exists a
$\psi_m(P)$ such that $\| f - \psi_m \| \le \varepsilon$. But, by virtue of the orthogonalization
process, $\psi_m(P)$ is a finite linear combination of the $\varphi_k(P)$, i.e.
$\psi_m(P) = c_1 \varphi_1(P) + c_2 \varphi_2(P) + \ldots + c_l \varphi_l(P)$, and thus

$$ \| f - \psi_m \|^2 = \int_{\mathcal{E}} \left[ f(P) - \sum_{k=1}^{l} c_k \varphi_k(P) \right]^2 G(d\mathcal{E}) \le \varepsilon^2. $$

If we replace the $c_k$ by the Fourier coefficients of $f(P)$ with respect
to system $\varphi_k(P)$, the inequality holds all the more [II; 148]:

$$ \int_{\mathcal{E}} [f(P) - S_l(f)]^2\,G(d\mathcal{E}) \le \varepsilon^2, $$

where $S_l(f)$ is the segment of the Fourier series of $f(P)$. Since $\varepsilon$ is
arbitrary, this inequality implies that the system $\varphi_k(P)$ is closed.

A further remark: when $G(\mathcal{E})$ corresponds to concentrated masses
at the points $P_1, P_2, \ldots, P_m$, a closed system contains only $m$ elements,
and this case is of no interest. As already mentioned, it reduces
to a finite-dimensional space $R_m$.


59. The space $l_2$. The space $l_2$ of infinite sequences is closely connected
with the space $L_2$; here, we consider the complex case straight
away. An element of $l_2$ is an infinite sequence of complex numbers
$x\,(x_1, x_2, x_3, \ldots)$ such that the series of $|x_n|^2$ is convergent. The
definitions of multiplication of an element by a complex number and
addition of elements are obvious. By definition, an element $cx$ has
coordinates $(cx_1, cx_2, \ldots)$ and the sum of elements $x$ and $y$ with
coordinates $x_n$ and $y_n$ has coordinates $x_n + y_n$; the convergence of the
series of $|x_n + y_n|^2$ is an immediate consequence of the convergence
of the series of $|x_n|^2$ and $|y_n|^2$, in view of the obvious inequality
$|x_n + y_n|^2 \le 2(|x_n|^2 + |y_n|^2)$. The norm of an element $x$ is given by

$$ \| x \| = \sqrt{ \sum_{n=1}^{\infty} |x_n|^2 } \tag{102} $$

and the scalar product of elements $x$ and $y$ by

$$ (x, y) = \sum_{n=1}^{\infty} x_n \bar{y}_n, \tag{103} $$

the absolute convergence of the series on the right being an immediate
consequence of $|x_n \bar{y}_n| \le \tfrac{1}{2}(|x_n|^2 + |y_n|^2)$.
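
The condition that the series of $|x_n|^2$ be convergent can be illustrated by two simple sequences. They are chosen by us as an example and do not appear in the text.

```latex
x = \left(1, \tfrac{1}{2}, \tfrac{1}{3}, \ldots\right): \qquad
\| x \|^2 = \sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6} < +\infty, \qquad x \in l_2 ;

y = \left(1, \tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{3}}, \ldots\right): \qquad
\sum_{n=1}^{\infty} |y_n|^2 = \sum_{n=1}^{\infty} \frac{1}{n} = +\infty, \qquad y \notin l_2 .
```

Both sequences tend to zero, so that the tendency of the coordinates to zero is not sufficient for a sequence to belong to $l_2$.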




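We have

$$ \| x \|^2 = (x, x). \tag{104} $$

The distance between elements $x$ and $y$ is given by

$$ \rho(x, y) = \sqrt{(x - y, x - y)} = \| x - y \| = \sqrt{ \sum_{n=1}^{\infty} |x_n - y_n|^2 }. \tag{105} $$

The following inequalities are precise analogues of (67) and (69):

$$ \left| \sum_{n=1}^{\infty} x_n \bar{y}_n \right|^2 \le \sum_{n=1}^{\infty} |x_n|^2 \sum_{n=1}^{\infty} |y_n|^2; \tag{106} $$

$$ \sqrt{ \sum_{n=1}^{\infty} |x_n + y_n|^2 } \le \sqrt{ \sum_{n=1}^{\infty} |x_n|^2 } + \sqrt{ \sum_{n=1}^{\infty} |y_n|^2 }. \tag{107} $$

They are proved in the same way as (67) and (69). Notice that
(106) can be written as

$$ |(x, y)|^2 \le \| x \|^2 \cdot \| y \|^2. \tag{108} $$

In view of (107), the triangle rule holds for distances. The zero
element is the element all the coordinates of which are zero. We say
that a sequence of elements $x^{(n)}$ is convergent to an element $x$ if
$\| x - x^{(n)} \| \to 0$. Let $x_k^{(n)}$ be the coordinates of $x^{(n)}$ and $x_k$ the
coordinates of $x$. The convergence of $x^{(n)}$ to $x$ is equivalent to

$$ \| x - x^{(n)} \|^2 = \sum_{k=1}^{\infty} |x_k - x_k^{(n)}|^2 \to 0 \quad \text{as } n \to \infty. \tag{109} $$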


To discover the connection between spaces $L_2$ and $l_2$ we take
any closed orthonormal system (81). For every function $f(P)$
of $L_2$ there will be a corresponding sequence of complex numbers
$a_n$, its Fourier coefficients, the series of $|a_n|^2$ being convergent.
Conversely, for every such sequence of complex numbers there is a
corresponding definite function of $L_2$, by virtue of Theorem 1. Thus, by
taking a closed system (81), we establish a one-to-one correspondence
between the elements of $L_2$ and $l_2$. Each element of $L_2$ corresponds
to one definite element of $l_2$ and vice versa. Since the Fourier
coefficients of a finite linear combination of functions $\sum_{k=1}^{m} c_k f_k(P)$ are equal
to the corresponding linear combination of Fourier coefficients of the
component functions $f_k(P)$, we can say that our one-to-one
correspondence is distributive, i.e. if elements $f_k(P)$ $(k = 1, 2, \ldots, m)$ of $L_2$
correspond to elements $x^{(k)}$ of $l_2$, the element $\sum_{k=1}^{m} c_k f_k(P)$ corresponds
to the element $\sum_{k=1}^{m} c_k x^{(k)}$. By the generalized closure equation (99),
the scalar products of corresponding elements in $L_2$ and $l_2$ are the
same. By the closure equation (98), the norms of elements are also
the same. Spaces $L_2$ and $l_2$ are different realizations of the same
abstract space. We shall later investigate the properties of this
abstract space and operators in it by describing the space with the
aid of a system of axioms. The concept of mutual convergence in
space $l_2$ must also be mentioned. We say that a sequence of elements
$x^{(n)}$ is mutually convergent in $l_2$ if, given any positive $\varepsilon$, there exists
an $N$ such that $\| x^{(n)} - x^{(m)} \| \le \varepsilon$ for $m$ and $n > N$. On taking
into account the correspondence between spaces $L_2$ and $l_2$ and Theorems
7 and 8 of [56], we can say that the necessary and sufficient condition
for a sequence $x^{(n)}$ to have a limit in $l_2$ is that it be mutually convergent.
The limit is unique.

We take the set $K$ of elements of $l_2$ having only a finite number
of non-zero coordinates, all these coordinates being rational complex
numbers, i.e. numbers of the form $a + bi$, where $a$ and $b$ are rational
real numbers. Since the set of rational numbers is denumerable, our
set $K$ is denumerable. Let us show that it is dense in $l_2$. Let
$x\,(x_1, x_2, x_3, \ldots)$ be an element of $l_2$ and $z\,(c_1, c_2, \ldots, c_n, 0, 0, \ldots)$ be an element
of the set $K$. We have

$$\|x - z\|^2 = \sum_{k=1}^{n} |x_k - c_k|^2 + \sum_{k=n+1}^{\infty} |x_k|^2. \tag{110}$$

Let $\varepsilon$ be a given positive number. Since the series of $|x_k|^2$ is
convergent, we can fix an $n = n_0$ such that

$$\sum_{k=n_0+1}^{\infty} |x_k|^2 < \frac{\varepsilon^2}{2}.$$

In the finite sum

$$\sum_{k=1}^{n_0} |x_k - c_k|^2$$

we can choose the rational numbers $c_k$ so close to $x_k$ that this sum
is also $< \varepsilon^2/2$. We now have, by (110), $\|x - z\|^2 < \varepsilon^2$, i.e. $\|x - z\| < \varepsilon$.
This shows that the denumerable set $K$ of elements of $l_2$ is dense
in $l_2$. The space $l_2$ is therefore separable. The elements of this space

$$e_1\,(1, 0, 0, \ldots); \quad e_2\,(0, 1, 0, 0, \ldots); \quad e_3\,(0, 0, 1, 0, \ldots); \ \ldots$$

form a closed orthonormal system.
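The two steps of the density argument (cut off the tail, then replace the remaining coordinates by rational numbers) can be sketched as follows. This is only an illustration under stated assumptions: the element is the hypothetical real sequence $x_k = 1/k$, whose tail $\sum_{k>n} 1/k^2$ is bounded by $1/n$, and the name `rational_head` is our own. A complex element would be handled in the same way, by rounding real and imaginary parts separately.

```python
from fractions import Fraction
import math

def rational_head(x_head, eps):
    """Rational c_k with sum |x_k - c_k|^2 < eps^2 / 2 over the given head."""
    n = len(x_head)
    # Rounding to a multiple of 1/D misses by at most 1/(2D) per coordinate,
    # so the head error is at most n / (4 D^2); D is chosen to make that
    # quantity smaller than eps^2 / 2.
    D = math.ceil(math.sqrt(n / 2) / eps) + 1
    return [Fraction(round(t * D), D) for t in x_head]

eps = 0.1
n0 = 201                           # chosen so that the tail bound is < eps^2 / 2
x_head = [1.0 / k for k in range(1, n0 + 1)]
tail_bound = 1.0 / n0              # upper bound for the sum of 1/k^2 over k > n0
z = rational_head(x_head, eps)     # coordinates of an element of K

head_error = sum((t - float(c)) ** 2 for t, c in zip(x_head, z))
dist_sq_bound = head_error + tail_bound   # bound for the right side of (110)
```

With these choices both terms of (110) fall below $\varepsilon^2/2$, so `dist_sq_bound` is below $\varepsilon^2$.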




60. Lineals in $L_2$. We now introduce some new concepts in
connection with $L_2$.

DEFINITION. A set $U$ of elements of $L_2$ (of functions of $L_2$) is called
a lineal when the following condition is satisfied: if $\varphi(P)$ and $\psi(P) \in U$, then
$c\varphi(P)$, where $c$ is any real number, and $\varphi(P) + \psi(P)$ also belong to $U$.

It follows from this definition that, if $f_k(P) \in U$ $(k = 1, 2, \ldots, m)$,
then $c_1 f_1(P) + c_2 f_2(P) + \ldots + c_m f_m(P) \in U$ for any choice of numbers
$c_k$. Let us notice some properties of lineals. Let $M$ be a set of functions
bounded on $\mathcal{E}$, i.e. functions such that, given any $f(P) \in M$, there
exists a number $d_f$ such that $|f(P)| \le d_f$ on $\mathcal{E}$. The set $M$ is a lineal
in $L_2$.

We say that $f(P)$ is continuous on a set $\mathcal{E}$ when the following condition is
fulfilled: if points $P$ and $P_n$ $(n = 1, 2, \ldots)$ belong to $\mathcal{E}$ and $P_n \to P$,
then $f(P_n) \to f(P)$ [IV; 157]. The set of functions of $L_2$ continuous on $\mathcal{E}$
is obviously also a lineal.

THEOREM 1. A lineal of functions of $L_2$ continuous on a bounded
set $\mathcal{E}$ is everywhere dense in $L_2$.

We have to show that, given any
element $f(P)$ of $L_2$ and any $\delta > 0$, there exists a function $\varphi(P) \in L_2$,
continuous on $\mathcal{E}$, such that

$$\|f - \varphi\|^2 = \int_{\mathcal{E}} [f(P) - \varphi(P)]^2\, G(d\mathcal{E}) < \delta, \tag{111}$$

where the two-dimensional case is considered, as above, for the sake
of clarity. We have $f(P) = f^+(P) - f^-(P)$, where $f^+(P)$ and $f^-(P)$ are
the positive and negative parts of $f(P)$. These functions, which belong
to $L_2$ and hence are summable on $\mathcal{E}$, can be assumed to take only
finite values and to be the limit functions of increasing sequences
of piecewise constant functions $\omega_n^+(P)$ and $\omega_n^-(P)$ with a finite number
of values, where $\omega_n^+(P) \le f^+(P)$ and $\omega_n^-(P) \le f^-(P)$ [46]. We have [54]:

$$\lim_{n \to \infty} \int_{\mathcal{E}} [f^+(P) - \omega_n^+(P)]^2\, G(d\mathcal{E}) = 0 \quad \text{and} \quad \lim_{n \to \infty} \int_{\mathcal{E}} [f^-(P) - \omega_n^-(P)]^2\, G(d\mathcal{E}) = 0. \tag{112}$$

Further, it follows from $(x_1 + x_2)^2 \le 2(x_1^2 + x_2^2)$ that

$$[f - (\omega_n^+ - \omega_n^-)]^2 \le 2(f^+ - \omega_n^+)^2 + 2(f^- - \omega_n^-)^2,$$

and, on bringing in the piecewise constant function $\omega_n(P) = \omega_n^+(P) - \omega_n^-(P)$
with a finite number of values, we can fix, in view of (112),
an $n$ such that $\|f - \omega_n\| < \varepsilon_0$, where $\varepsilon_0$ is any given positive number.



On observing that $\|f - \varphi\| \le \|f - \omega_n\| + \|\omega_n - \varphi\|$, we only
need to show that there exists a continuous function $\varphi(P)$ such that
$\|\omega - \varphi\| < \varepsilon$, where $\varepsilon$ is any given positive number and $\omega(P)$ is
a given function with a finite number of values. Such a function can
be written in the form

$$\omega(P) = \sum_{k=1}^{m} c_k\, \omega_{\mathcal{E}_k}(P),$$

where $\omega_{\mathcal{E}_k}(P)$ is the characteristic function of the fixed sets $\mathcal{E}_k$
belonging to $\mathcal{E}$. If $\varphi_k(P)$ $(k = 1, 2, \ldots, m)$ are functions continuous on
$\mathcal{E}$ and $\varphi(P) = c_1\varphi_1(P) + c_2\varphi_2(P) + \ldots + c_m\varphi_m(P)$, then

$$\|\omega - \varphi\| \le \sum_{k=1}^{m} |c_k|\, \|\omega_{\mathcal{E}_k} - \varphi_k\|$$

and the proof of the theorem reduces to the proof of the following:
given the characteristic function $\omega_{\mathcal{E}_0}(P)$ of any measurable set $\mathcal{E}_0$
belonging to $\mathcal{E}$, and any $\varepsilon > 0$, there exists a function $\varphi(P)$ continuous
on $\mathcal{E}$ such that $\|\omega_{\mathcal{E}_0} - \varphi\| < \varepsilon$. We know that, given any $\varepsilon > 0$,
there exists a closed set $F$ belonging to $\mathcal{E}_0$ such that $G(\mathcal{E}_0 - F) < \varepsilon^2$ [35]. Now:

$$\|\omega_{\mathcal{E}_0} - \omega_F\|^2 = \int_{\mathcal{E}} [\omega_{\mathcal{E}_0}(P) - \omega_F(P)]^2\, G(d\mathcal{E}) = \int_{\mathcal{E}_0 - F} G(d\mathcal{E}) = G(\mathcal{E}_0 - F) < \varepsilon^2;$$

by virtue of the inequality $\|\omega_{\mathcal{E}_0} - \varphi\| \le \|\omega_{\mathcal{E}_0} - \omega_F\| + \|\omega_F - \varphi\|$,
it is sufficient to prove our last assertion for the characteristic
function of a bounded closed set $F$ belonging to $\mathcal{E}$. Let $r(P)$ denote
the distance from the point $P$ to the set $F$. We have $r(Q_1) \le r(Q) + |QQ_1|$
and $r(Q) \le r(Q_1) + |QQ_1|$, where $|QQ_1|$ is the distance
between $Q$ and $Q_1$, whence it follows that $r(P)$ is continuous. Further
$r(Q) = 0$ when and only when $Q \in F$ [II; 89]. It is easy to see that
$\omega_F(P)$ is the limit of a non-increasing sequence of functions
continuous on $\mathcal{E}$:

$$\varphi_n(P) = \frac{1}{1 + n\,r(P)}, \tag{113}$$

so that $\|\omega_F - \varphi_n\| \to 0$ [54] as $n \to \infty$, i.e. given any $\varepsilon > 0$, there
exists an $n$ such that $\|\omega_F - \varphi_n\| < \varepsilon$, where $\varphi_n(P)$ is continuous
on $\mathcal{E}$; the theorem is therefore proved.
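The last step of the proof can be illustrated in one dimension. The sketch below assumes the Lebesgue integral, takes the hypothetical sets $\mathcal{E} = [-1, 2]$ and $F = [0, 1]$, and evaluates $\|\omega_F - \varphi_n\|^2$ by a midpoint rule. For this particular choice a direct integration gives the value $2/(n + 1)$, which the numerical results can be compared against.

```python
def r(x):
    """Distance from the point x to the closed set F = [0, 1]."""
    return max(0.0, -x, x - 1.0)

def phi(n, x):
    """The continuous function (113)."""
    return 1.0 / (1.0 + n * r(x))

def omega_F(x):
    """Characteristic function of F."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def dist_sq(n, cells=30000):
    """Midpoint-rule value of ||omega_F - phi_n||^2 on the interval [-1, 2]."""
    h = 3.0 / cells
    total = 0.0
    for i in range(cells):
        x = -1.0 + (i + 0.5) * h
        total += (omega_F(x) - phi(n, x)) ** 2 * h
    return total

ns = (1, 10, 100, 1000)
errors = [dist_sq(n) for n in ns]   # decreases towards zero as n grows
```

The functions `phi(n, x)` equal 1 on $F$ and decrease to 0 at every point outside $F$, which is why the squared distance tends to zero.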

COROLLARY 1. Let us confine ourselves for clarity to the case of
a plane, and take $\mathcal{E}$ as the closed interval $\Delta$ $(a_1 \le x \le b_1;\ a_2 \le y \le b_2)$.
Given any function $\varphi(x, y)$ continuous on $\Delta$, we can form a
polynomial $p(x, y)$ such that $|\varphi(x, y) - p(x, y)| \le \varepsilon_0$ on $\Delta$, where $\varepsilon_0$ is
any given positive number. Now:

$$\|\varphi - p\|^2 = \int_{\Delta} [\varphi(x, y) - p(x, y)]^2\, G(d\Delta) \le \varepsilon_0^2\, G(\Delta).$$

Using our theorem and the fact that $\|f - p\| \le \|f - \varphi\| + \|\varphi - p\|$,
where $f(x, y) \in L_2$ on $\Delta$, we can say that a lineal of
polynomials is everywhere dense in $L_2$ on $\Delta$.

Instead of the interval $\Delta$, we could have taken any bounded closed
set $F$, since a function continuous on $F$ can be extended to a closed
interval $\Delta$ containing $F$ whilst preserving continuity [IV; 157].
We have to take into account here that

$$\|\varphi - p\|^2 = \int_{F} [\varphi(x, y) - p(x, y)]^2\, G(dF) \le \int_{\Delta} [\varphi(x, y) - p(x, y)]^2\, G(d\Delta).$$


COROLLARY 2. Since the rational numbers are everywhere dense
on the real number axis, given any polynomial $p(x, y)$ and any
$\varepsilon_0 > 0$, there exists a polynomial with rational coefficients $q(x, y)$
such that $|p(x, y) - q(x, y)| \le \varepsilon_0$ on $\Delta$ or $F$.

It follows at once from this that the polynomials with rational
coefficients form a set everywhere dense in $L_2$ on $\Delta$ or $F$.

Let us show that the set of such polynomials is denumerable.
We associate with each polynomial $q(x, y)$ a positive integer
$\sigma = n + r + s$, where $n$ is the degree of $q(x, y)$, $r$ is the least common
denominator of its coefficients ($r$ is taken as positive), and $s$ is the
sum of the absolute values of the numerators in its coefficients
reduced to the denominator $r$ (exception: if $q(x, y) \equiv 0$, we associate
with it $\sigma = 0$). It is easily seen that the number of polynomials
corresponding to the same $\sigma$ is finite. We can enumerate all the
polynomials with rational coefficients in order of increasing numbers
$\sigma$ corresponding to them, the order of polynomials with the same $\sigma$
being of no consequence. We see that there exists a denumerable
set of elements of $L_2$, everywhere dense in $L_2$, i.e. $L_2$ on $\Delta$ or $F$ is
separable.
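The enumeration by the index $\sigma = n + r + s$ can be sketched as follows. To keep the example short we assume polynomials in a single variable (the text works with two), and the function names are our own. The test for the least common denominator uses the fact that $r$ is the least common denominator of the fractions $a_i/r$ exactly when the numerators $a_i$ and $r$ have no common factor.

```python
from fractions import Fraction
from functools import reduce
from itertools import product
from math import gcd

def polynomials_with_index(sigma):
    """All one-variable rational polynomials q with n + r + s == sigma.

    A polynomial is returned as a tuple of Fractions, constant term first.
    """
    if sigma == 0:
        return [(Fraction(0),)]              # the exceptional case q = 0
    found = []
    for n in range(0, sigma - 1):            # degree
        for r in range(1, sigma - n):        # common denominator
            s = sigma - n - r                # sum of |numerators|, at least 1
            for nums in product(range(-s, s + 1), repeat=n + 1):
                if nums[-1] == 0 or sum(abs(a) for a in nums) != s:
                    continue
                # r must be the *least* common denominator.
                if reduce(gcd, nums, r) != 1:
                    continue
                found.append(tuple(Fraction(a, r) for a in nums))
    return found

def enumerate_polynomials(max_sigma):
    """List the polynomials in order of increasing sigma."""
    listing = []
    for sigma in range(0, max_sigma + 1):
        listing.extend(polynomials_with_index(sigma))
    return listing
```

Each value of $\sigma$ yields a finite list, and every polynomial with rational coefficients appears exactly once in the combined listing, which is the content of the denumerability argument.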

Now let $\mathcal{E}$ be any bounded measurable set. We take its closure $\bar{\mathcal{E}}$.
This is a bounded closed set, and, as we have shown, there exists in
$L_2$ on $\bar{\mathcal{E}}$ a denumerable everywhere dense set $\varphi_k(x, y)$ $(k = 1, 2, \ldots)$.
These will be functions of $L_2$ on $\bar{\mathcal{E}}$, and hence on $\mathcal{E}$. Let us show
that $\varphi_k(x, y)$ are everywhere dense on $\mathcal{E}$. We take some function
$f(x, y)$ of $L_2$ on $\mathcal{E}$ and extend it by zero onto $\bar{\mathcal{E}}$. It will also be of $L_2$
on $\bar{\mathcal{E}}$, and hence, given any $\varepsilon > 0$, a $\varphi_k(x, y)$ of the denumerable
set mentioned can be found such that

$$\int_{\bar{\mathcal{E}}} [f(x, y) - \varphi_k(x, y)]^2\, G(d\mathcal{E}) = \int_{\mathcal{E}} [f(x, y) - \varphi_k(x, y)]^2\, G(d\mathcal{E}) + \int_{\bar{\mathcal{E}} - \mathcal{E}} [f(x, y) - \varphi_k(x, y)]^2\, G(d\mathcal{E}) < \varepsilon$$

and all the more,

$$\int_{\mathcal{E}} [f(x, y) - \varphi_k(x, y)]^2\, G(d\mathcal{E}) < \varepsilon,$$

which is what we had to show. The separability of $L_2$ on any measurable set will be proved
below.

We shall prove a further theorem, which will be useful later. 

THEOREM 2. If the closure equation holds for every function of a
set $K$, dense in $L_2$, it also holds for any function of $L_2$.

Let $f(P)$ be an element of $L_2$ and $\varepsilon$ a given positive number. Since
$K$ is dense in $L_2$, there is an element $\varphi(P)$ in $K$ such that $\|f - \varphi\| < \varepsilon/3$.
By hypothesis, the closure equation holds for $\varphi(P)$, so that we can
take a segment $s_n(\varphi)$ of the Fourier series of $\varphi(P)$ with respect to the
given orthonormal system of functions such that $\|\varphi - s_n(\varphi)\| < \varepsilon/3$.

On taking into account the equation $f - s_n(f) = (f - \varphi) + (\varphi - s_n(\varphi)) + (s_n(\varphi) - s_n(f))$
and the triangle rule, we get
$\|f - s_n(f)\| \le \|f - \varphi\| + \|\varphi - s_n(\varphi)\| + \|s_n(\varphi) - s_n(f)\|$, whence
$\|f - s_n(f)\| < 2\varepsilon/3 + \|s_n(\varphi) - s_n(f)\|$. But the difference $s_n(\varphi) - s_n(f)$
between the segments of the Fourier series for $\varphi$ and $f$ is the segment
of the Fourier series for $\varphi - f$, i.e. $s_n(\varphi) - s_n(f) = s_n(\varphi - f)$, and,
by Bessel's inequality, $\|s_n(\varphi) - s_n(f)\| \le \|\varphi - f\| < \varepsilon/3$. Finally,
$\|f - s_n(f)\| < 2\varepsilon/3 + \varepsilon/3 = \varepsilon$, whence it follows, since $\varepsilon$ is arbitrary,
that the closure equation holds for $f(P)$, and the theorem is proved.

Everything said above can be generalized at once to the case of
complex functions of $L_2$ [58].


61. Examples of closed systems. We shall give some simple examples
of orthonormal systems, closed in the finite interval $[a, b]$.
If we apply the orthogonalization process to non-negative integral
powers of $x$: $1, x, x^2, \ldots$ [IV; 38], we get a system of orthogonal
polynomials $p_k(x)$ $(k = 0, 1, 2, \ldots)$ on the interval $[a, b]$, where
$p_k(x)$ is of degree $k$. Every polynomial $p(x)$ of degree $n$ can be written
as a linear combination

$$p(x) = \sum_{k=0}^{n} c_k\, p_k(x). \tag{114}$$

To see this, it is sufficient to define $c_n$ in such a way that the
coefficient of $x^n$ on the right-hand side is the same as in $p(x)$. We then
have to define $c_{n-1}$ so that the coefficient of $x^{n-1}$ in the term $c_{n-1}\, p_{n-1}(x)$
is the same as in $p(x) - c_n p_n(x)$, and so on. The coefficients $c_k$ in
(114) are obviously equal to the Fourier coefficients of $p(x)$ with
respect to the $p_k(x)$. It follows from the exact equation (114) that,
in the case of an orthogonal system $p_k(x)$, the closure equation holds
for any polynomial $p(x)$, so that, by Theorem 2 of the previous section,
the system of orthogonal polynomials is closed. We have seen above
that, on the interval $[-l, +l]$, with the orthogonal system

$$\sin\frac{n\pi x}{l}; \quad \cos\frac{n\pi x}{l} \quad (n = 0, 1, 2, \ldots) \tag{115}$$

the closure equation is fulfilled for any continuous function [II; 148],
whence it follows that system (115) is closed in $L_2$. Similarly, the
orthogonal systems of functions

$$\sin\frac{n\pi x}{l} \quad (n = 1, 2, \ldots) \quad \text{and} \quad \cos\frac{n\pi x}{l} \quad (n = 0, 1, 2, \ldots)$$

are closed on the interval $[0, l]$.

We saw earlier [IV; 99], that in the case of the eigenfunctions
$\varphi_k(x)$ $(k = 1, 2, \ldots)$ of a boundary value problem, every function with
continuous derivatives up to the second order and satisfying the boundary
conditions can be expanded in a uniformly convergent Fourier series in
functions $\varphi_k(x)$. The closure equation
will hold all the more for such functions. By varying the value of the
function in narrow intervals close to the
ends of the interval, it may easily be
seen that the closure equations are
observed for all functions with continuous
derivatives up to the second order, without requiring that
the boundary conditions be satisfied at the ends. The closure equation
will be satisfied all the more for all polynomials, so that
the system of eigenfunctions $\varphi_k(x)$ is closed.

[Fig. 3: the curve $y = x^{\alpha}$ with the straight lines $x = a$ and $y = b$; referred to in [62].]
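The orthogonalization of the powers $1, x, x^2, \ldots$ described at the start of this section, and the expansion (114), can be carried out in exact rational arithmetic. The sketch below is only an illustration: polynomials are stored as coefficient lists with the constant term first, the interval is taken as $[-1, 1]$, no normalization is performed, and all function names are our own.

```python
from fractions import Fraction

def multiply(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def inner(p, q, a, b):
    """Integral of p(x) q(x) over [a, b], computed exactly."""
    prod = multiply(p, q)
    return sum(c * (b ** (k + 1) - a ** (k + 1)) / (k + 1)
               for k, c in enumerate(prod))

def orthogonal_polynomials(count, a, b):
    """Orthogonalize 1, x, x^2, ... on [a, b] (leading coefficient 1)."""
    a, b = Fraction(a), Fraction(b)
    basis = []
    for k in range(count):
        p = [Fraction(0)] * k + [Fraction(1)]          # the power x^k
        for q in basis:
            coef = inner(p, q, a, b) / inner(q, q, a, b)
            padded = q + [Fraction(0)] * (len(p) - len(q))
            p = [u - coef * v for u, v in zip(p, padded)]
        basis.append(p)
    return basis

def expand(p, basis, a, b):
    """Coefficients c_k of (114), found as Fourier coefficients."""
    return [inner(p, q, a, b) / inner(q, q, a, b) for q in basis]

P = orthogonal_polynomials(4, -1, 1)
```

On $[-1, 1]$ the first few results are $1$, $x$, $x^2 - \frac{1}{3}$ and $x^3 - \frac{3}{5}x$, i.e. the Legendre polynomials up to constant factors. Because the system is orthogonal but not normalized here, `expand` divides by the squared norm of each $p_k$.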




62. The Hölder and Minkovskii inequalities. In addition to the
class $L_2$ a class often discussed is $L_p$, of measurable functions $f(P)$
for which the $p$th power of the absolute value (or modulus for a
complex function), i.e. $|f(P)|^p$, is summable on $\mathcal{E}$ [cf. 55]. We shall
first deduce inequalities for sums and integrals analogous to (67) and
(69), with any index $p$ greater than unity.

Let $\alpha$ be a positive number. We take the curve $y = x^{\alpha}$ on the $XY$
plane and draw the straight lines $x = a$ and $y = b$, parallel to the
axes (Fig. 3). These straight lines, the axes and the curve bound
two plane domains, having the areas:

$$S_1 = \int_0^a x^{\alpha}\, dx = \frac{a^{1+\alpha}}{1+\alpha}; \qquad S_2 = \int_0^b y^{1/\alpha}\, dy = \frac{b^{1+1/\alpha}}{1+1/\alpha}.$$

As is directly obvious from the figure, the sum of these areas is
not less than the area $ab$ of the rectangle with sides $a$ and $b$, i.e.

$$ab \le \frac{a^{1+\alpha}}{1+\alpha} + \frac{b^{1+1/\alpha}}{1+1/\alpha}.$$

On writing $p = 1 + \alpha$ and $p' = 1 + 1/\alpha$, we can rewrite the
inequality as

$$ab \le \frac{a^p}{p} + \frac{b^{p'}}{p'}, \tag{116}$$

where $p$ and $p'$ are obviously connected by the relationship

$$\frac{1}{p} + \frac{1}{p'} = 1. \tag{117}$$

In view of the arbitrariness of the positive number $\alpha$, inequality
(116) holds for any positive $p$ and $p'$ connected by (117). Both these
numbers must evidently be greater than unity. If $p = 2$, then $p' = 2$,
and (116) reduces to the obvious inequality: $2ab \le a^2 + b^2$. It follows
from Fig. 3 that the $=$ sign in (116) holds when and only when the
point of intersection of $x = a$ and $y = b$ lies on $y = x^{\alpha}$, i.e. when
$b = a^{\alpha}$. Suppose further that the positive numbers $a_k$ and $b_k$
$(k = 1, 2, \ldots, n)$ satisfy the relationships

$$\sum_{k=1}^{n} a_k^p = 1 \quad \text{and} \quad \sum_{k=1}^{n} b_k^{p'} = 1. \tag{118}$$



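We put $a = a_k$ and $b = b_k$ in (116). On summing over $k$ and taking
(117) and (118) into account, we get

$$\sum_{k=1}^{n} a_k b_k \le 1. \tag{119}$$

We now take any positive numbers $a_k$ and $b_k$ and write

$$A = \left(\sum_{k=1}^{n} a_k^p\right)^{1/p}; \qquad B = \left(\sum_{k=1}^{n} b_k^{p'}\right)^{1/p'}. \tag{120}$$

The numbers $a_k' = a_k/A$ and $b_k' = b_k/B$ obviously satisfy (118),
and we thus have inequality (119) for them, whilst (119) can here
be written as

$$\sum_{k=1}^{n} a_k b_k \le AB,$$

i.e.

$$\sum_{k=1}^{n} a_k b_k \le \left(\sum_{k=1}^{n} a_k^p\right)^{1/p} \left(\sum_{k=1}^{n} b_k^{p'}\right)^{1/p'}. \tag{121}$$

On passing to the limit, we get a similar inequality for the infinite
sums

$$\sum_{k=1}^{\infty} a_k b_k \le \left(\sum_{k=1}^{\infty} a_k^p\right)^{1/p} \left(\sum_{k=1}^{\infty} b_k^{p'}\right)^{1/p'} \tag{122}$$

on the assumption that the series on the right are convergent. In this
case, the series on the left must be convergent, by virtue of the
inequality. Certain of the $a_k$ and $b_k$ may be zero. In the case of complex
numbers, we can use the obvious inequality

$$\left|\sum_{k} a_k b_k\right| \le \sum_{k} |a_k|\,|b_k|$$

to write (122) as

$$\left|\sum_{k} a_k b_k\right| \le \left(\sum_{k} |a_k|^p\right)^{1/p} \left(\sum_{k} |b_k|^{p'}\right)^{1/p'}. \tag{123}$$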


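These inequalities are usually known as the Hölder inequalities
for sums. When $p = p' = 2$, they degenerate to the ordinary inequality
(106) of [59]. Similar inequalities hold for integrals. Suppose that
$f(P) \in L_p$ and $g(P) \in L_{p'}$. We have by (116):

$$|f(P)\, g(P)| \le \frac{|f(P)|^p}{p} + \frac{|g(P)|^{p'}}{p'}.$$

The right-hand side is summable on $\mathcal{E}$, so that $f(P)g(P)$ is
summable, i.e. if $f(P) \in L_p$ and $g(P) \in L_{p'}$, $f(P)g(P)$ is summable
(cf. Theorem 1 of [55]). The integral of $f(P)g(P)$ satisfies a Hölder
inequality analogous to (67) of [55]:

$$\left|\iint_{\mathcal{E}} f g\, dx\, dy\right| \le \left(\iint_{\mathcal{E}} |f|^p\, dx\, dy\right)^{1/p} \left(\iint_{\mathcal{E}} |g|^{p'}\, dx\, dy\right)^{1/p'}, \tag{124}$$

this being written only for Lebesgue integrals (it is also true for
Lebesgue-Stieltjes integrals). It is obtained in the usual way from
(122) with the aid of a passage to the limit. Let $\delta_n'$ and $\delta_n''$ be
indefinitely diminishing sequences of Lebesgue subdivisions for $|f|$ and
$|g|$ and $\delta_n = \delta_n' \delta_n''$ be the product of these subdivisions. Let $\mathcal{E}_k^{(n)}$ be
the component parts of the set $\mathcal{E}$ in the subdivision $\delta_n$, and $M_{f,n}^{(k)}$
and $M_{g,n}^{(k)}$ the strict upper bounds of $|f|$ and $|g|$ on $\mathcal{E}_k^{(n)}$. On taking
(117) into account, we can write

$$M_{f,n}^{(k)} M_{g,n}^{(k)}\, m(\mathcal{E}_k^{(n)}) = M_{f,n}^{(k)}\, m^{1/p}(\mathcal{E}_k^{(n)}) \cdot M_{g,n}^{(k)}\, m^{1/p'}(\mathcal{E}_k^{(n)}).$$

We now apply Hölder's inequality with $a_k = M_{f,n}^{(k)}\, m^{1/p}(\mathcal{E}_k^{(n)})$ and
$b_k = M_{g,n}^{(k)}\, m^{1/p'}(\mathcal{E}_k^{(n)})$:

$$\sum_{k} M_{f,n}^{(k)} M_{g,n}^{(k)}\, m(\mathcal{E}_k^{(n)}) \le \left(\sum_{k} \bigl(M_{f,n}^{(k)}\bigr)^p\, m(\mathcal{E}_k^{(n)})\right)^{1/p} \left(\sum_{k} \bigl(M_{g,n}^{(k)}\bigr)^{p'}\, m(\mathcal{E}_k^{(n)})\right)^{1/p'}. \tag{125}$$

We write $M_{fg,n}^{(k)}$ for the strict upper bound of $|f|\,|g|$ on $\mathcal{E}_k^{(n)}$.
We obviously have $M_{fg,n}^{(k)} \le M_{f,n}^{(k)} M_{g,n}^{(k)}$, and it follows from (125) that

$$\sum_{k} M_{fg,n}^{(k)}\, m(\mathcal{E}_k^{(n)}) \le \left(\sum_{k} \bigl(M_{f,n}^{(k)}\bigr)^p\, m(\mathcal{E}_k^{(n)})\right)^{1/p} \left(\sum_{k} \bigl(M_{g,n}^{(k)}\bigr)^{p'}\, m(\mathcal{E}_k^{(n)})\right)^{1/p'}.$$

On passing to the limit for the sequence of Lebesgue subdivisions,
we get

$$\iint_{\mathcal{E}} |f|\,|g|\, dx\, dy \le \left(\iint_{\mathcal{E}} |f|^p\, dx\, dy\right)^{1/p} \left(\iint_{\mathcal{E}} |g|^{p'}\, dx\, dy\right)^{1/p'}, \tag{126}$$

whence (124) follows.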

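Let us prove a further inequality, analogous to (69) of [55]. We first
take the case of a sum. As above, let $a_k$ and $b_k$ be sequences of positive
numbers. On summing the obvious equation

$$(a_k + b_k)^p = (a_k + b_k)^{p-1} a_k + (a_k + b_k)^{p-1} b_k,$$

we get

$$\sum_{k} (a_k + b_k)^p = \sum_{k} (a_k + b_k)^{p-1} a_k + \sum_{k} (a_k + b_k)^{p-1} b_k.$$

On applying Hölder's inequality to the sums on the right, we arrive
at the inequality

$$\sum_{k} (a_k + b_k)^p \le \left(\sum_{k} a_k^p\right)^{1/p} \left(\sum_{k} (a_k + b_k)^{(p-1)p'}\right)^{1/p'} + \left(\sum_{k} b_k^p\right)^{1/p} \left(\sum_{k} (a_k + b_k)^{(p-1)p'}\right)^{1/p'}.$$

But, by (117), $p' = p/(p - 1)$, and the last inequality can be
rewritten as

$$\sum_{k} (a_k + b_k)^p \le \left(\sum_{k} (a_k + b_k)^p\right)^{1/p'} \left[\left(\sum_{k} a_k^p\right)^{1/p} + \left(\sum_{k} b_k^p\right)^{1/p}\right].$$

On dividing both sides by the factor in front of the square bracket,
we arrive at Minkovskii's inequality for a sum:

$$\left(\sum_{k} (a_k + b_k)^p\right)^{1/p} \le \left(\sum_{k} a_k^p\right)^{1/p} + \left(\sum_{k} b_k^p\right)^{1/p}. \tag{127}$$

This inequality leads, precisely as above, to the Minkovskii integral
inequality for $f(P)$ and $g(P) \in L_p$:

$$\left(\iint_{\mathcal{E}} |f + g|^p\, dx\, dy\right)^{1/p} \le \left(\iint_{\mathcal{E}} |f|^p\, dx\, dy\right)^{1/p} + \left(\iint_{\mathcal{E}} |g|^p\, dx\, dy\right)^{1/p}; \tag{128}$$

we have to notice here that $|f + g| \le |f| + |g|$. Inequalities (127)
and (128) have been deduced on the assumption that $p > 1$. They
are obvious for $p = 1$, but cease to be valid for $p < 1$.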
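Inequalities (116), (121) and (127) are easy to test on finite sequences, and the same code shows by a two-term example how (127) fails for $p < 1$. The following is a sketch on hypothetical random data with a fixed seed; the "gap" functions return the right-hand side minus the left-hand side, and their names are our own.

```python
import random

def p_norm(v, p):
    """(sum |v_k|^p)^(1/p) for a finite sequence."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

def young_gap(a, b, p):
    """Right side minus left side of (116); requires p > 1."""
    q = p / (p - 1.0)                  # the index p' given by (117)
    return a ** p / p + b ** q / q - a * b

def holder_gap(a, b, p):
    """Right side minus left side of (121); requires p > 1."""
    q = p / (p - 1.0)
    return p_norm(a, p) * p_norm(b, q) - sum(s * t for s, t in zip(a, b))

def minkowski_gap(a, b, p):
    """Right side minus left side of (127)."""
    total = [s + t for s, t in zip(a, b)]
    return p_norm(a, p) + p_norm(b, p) - p_norm(total, p)

rng = random.Random(0)                 # fixed seed, hypothetical test data
a = [rng.uniform(0.1, 5.0) for _ in range(50)]
b = [rng.uniform(0.1, 5.0) for _ in range(50)]

# Counterexample for p = 1/2: the left side of (127) is 4, the right side 2.
failure = minkowski_gap([1.0, 0.0], [0.0, 1.0], 0.5)
```

For $p > 1$ all three gaps are non-negative, and `young_gap(a, a ** (p - 1), p)` vanishes up to rounding, in agreement with the condition $b = a^{\alpha}$ for equality in (116). The value of `failure` is negative.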

By using the above inequalities, we can easily prove for the
function space $L_p$ $(p > 1)$ the properties that we had earlier for $L_2$,
the functions here being assumed complex. Let us recapitulate these
properties, in the order of [55]. If $f(P) \in L_p$ and $g(P) \in L_{p'}$ $(p > 1)$,
$f(P)$ and $f(P)g(P)$ are summable on $\mathcal{E}$. This follows from (124).
If $f(P)$ and $g(P) \in L_p$ and $c$ is a constant, then $cf(P)$ and
$f(P) + g(P) \in L_p$ $(p > 1)$. This follows from (128). A sequence of functions
$f_n(P)$ of $L_p$ is said to be convergent in the mean in $L_p$ $(p > 1)$ or
convergent in the mean with index $p$ to the function $f(P)$ of $L_p$ if

$$\lim_{n \to \infty} \int_{\mathcal{E}} |f(P) - f_n(P)|^p\, G(d\mathcal{E}) = 0.$$

The limit in the mean in $L_p$ is unique up to equivalent functions.
If $f_n(P) \to f(P)$ in the mean, a subsequence $f_{n_k}(P)$ can be
extracted from the sequence $f_n(P)$ such that it is convergent almost
everywhere on $\mathcal{E}$ to $f(P)$. Mutual convergence is defined by a
condition analogous to (72):

$$\int_{\mathcal{E}} |f_n(P) - f_m(P)|^p\, G(d\mathcal{E}) < \varepsilon$$

for $n$ and $m > N$, and the necessary and sufficient condition that
the sequence $f_n(P)$ be convergent in the mean to a function of $L_p$
$(p > 1)$ is that it be mutually convergent in $L_p$.

If $f_n(P) \to f(P)$ in $L_p$ and $g_n(P) \to g(P)$ in $L_{p'}$ $(p > 1)$, then

$$\lim_{n \to \infty} \int_{\mathcal{E}} f_n(P)\, g_n(P)\, G(d\mathcal{E}) = \int_{\mathcal{E}} f(P)\, g(P)\, G(d\mathcal{E}).$$

We can also introduce the norm in $L_p$ $(p > 1)$:

$$\|f\| = \left(\int_{\mathcal{E}} |f(P)|^p\, G(d\mathcal{E})\right)^{1/p}$$

and the distance between two elements $\rho(f, g) = \|f - g\|$, where we
have $\|cf(P)\| = |c|\, \|f(P)\|$ and the triangle rule.
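Let us show further that, if $q > p$ and $f(P) \in L_q$, then $f(P) \in L_p$.
By hypothesis,

$$\int_{\mathcal{E}} |f(P)|^q\, G(d\mathcal{E}) = A < +\infty.$$

We consider the integral:

$$\int_{\mathcal{E}} |f(P)|^p\, G(d\mathcal{E}) = \int_{\mathcal{E}(|f| \le 1)} |f(P)|^p\, G(d\mathcal{E}) + \int_{\mathcal{E}(|f| > 1)} |f(P)|^p\, G(d\mathcal{E}) \le \int_{\mathcal{E}} G(d\mathcal{E}) + \int_{\mathcal{E}} |f(P)|^q\, G(d\mathcal{E}) = G(\mathcal{E}) + A,$$

whence it follows that $f(P) \in L_p$. In the proof, we have used the fact
that the measure $G(\mathcal{E})$ of the set $\mathcal{E}$ is finite.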
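That the inclusion cannot be reversed may be seen from a simple worked example. It is our own illustration, not taken from the text, and assumes the Lebesgue integral on the interval $(0, 1]$:

```latex
f(x) = x^{-1/2} \quad \text{on } (0, 1]: \qquad
\int_0^1 |f(x)|^p \, dx = \int_0^1 x^{-p/2} \, dx = \frac{1}{1 - p/2} \quad (0 < p < 2),
\qquad
\int_0^1 |f(x)|^2 \, dx = \int_0^1 \frac{dx}{x} = +\infty .
```

Thus this $f$ belongs to $L_p$ for every $p < 2$ but not to $L_2$, so that a function of $L_p$ need not belong to $L_q$ when $q > p$.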

But in $L_p$ (with $p \ne 2$), we do not have the scalar product that
we had in $L_2$.

A space $l_p$ can be formed in the same way as $l_2$, in which the elements
are infinite sequences of complex numbers $(x_1, x_2, \ldots)$ such that the
series formed from $|x_k|^p$ is convergent. It has properties analogous
to those of $l_2$ when $p > 1$, the resemblance being the same as that
of $L_p$ to $L_2$. There is no scalar product in $l_p$ $(p \ne 2)$, and the connection
with $L_p$, such as we established between $L_2$ and $l_2$, is missing. Inequalities
(106) and (107) are replaced by (122) and (127), in which $a_k = |x_k|$
and $b_k = |y_k|$.




63. Integral over a set of infinite measure. We have so far
considered the integral over a measurable set $\mathcal{E}$ of finite measure. The
integral can be extended to the case of a set of infinite measure in
much the same way as the Riemann integral was extended to the
case of an infinite interval. Let $f(P)$ be a measurable and non-negative
function given on a measurable set $\mathcal{E}$ of infinite measure. We take
an increasing infinite sequence of sets of finite measure

$$\mathcal{E}_1 \subset \mathcal{E}_2 \subset \mathcal{E}_3 \subset \ldots \tag{129}$$

for which $\mathcal{E}$ is the limiting set. The sets $\mathcal{E}_n$ can be formed say from
the products of the set $\mathcal{E}$ with the intervals $\Delta_n$ $(-n \le x \le +n;\ -n \le y \le +n)$.
The integrals

$$\int_{\mathcal{E}_n} f(P)\, G(d\mathcal{E}) \tag{130}$$

exist for the bounded sets, and do not decrease as $n$ increases because
$f(P)$ is non-negative. The limit of the monotonic sequence (130) is
defined as the integral of $f(P)$ over $\mathcal{E}$:

$$\int_{\mathcal{E}} f(P)\, G(d\mathcal{E}) = \lim_{n \to \infty} \int_{\mathcal{E}_n} f(P)\, G(d\mathcal{E}). \tag{131}$$

Notice that integrals (130) may be equal to $(+\infty)$. In this case
the integral of $f(P)$ over $\mathcal{E}$ is obviously also $(+\infty)$. It may happen
that all the integrals (130) are finite, whilst the integral over $\mathcal{E}$ is
$(+\infty)$. To justify the above definition of the integral, we have to
show that the limit of the numerical sequence (130) does not depend
on the choice of monotonic increasing sequence of sets $\mathcal{E}_n$.

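THEOREM. Integrals (130) have the same limit whatever the choice
of the increasing sequence of measurable sets $\mathcal{E}_n$ of finite measure tending
to $\mathcal{E}$.

We use reductio ad absurdum. Let $\mathcal{E}_1' \subset \mathcal{E}_2' \subset \mathcal{E}_3' \subset \ldots$ be another
increasing sequence of sets of finite measure having $\mathcal{E}$ as the limiting
set and such that the sequences of integrals (130) have different
limits for sets $\mathcal{E}_n$ and $\mathcal{E}_n'$:

$$\lim_{n \to \infty} \int_{\mathcal{E}_n} f(P)\, G(d\mathcal{E}) = a \quad \text{and} \quad \lim_{n \to \infty} \int_{\mathcal{E}_n'} f(P)\, G(d\mathcal{E}) = b > a. \tag{132}$$

The number $a$ is always finite, and we have

$$\int_{\mathcal{E}_n} f(P)\, G(d\mathcal{E}) \le a \qquad (n = 1, 2, \ldots). \tag{133}$$

Suppose first that the number $b$ is finite.
Having chosen the positive number $c < b - a$, we can fix a value
of the positive integer $m$ such that

$$\int_{\mathcal{E}_m'} f(P)\, G(d\mathcal{E}) > a + c. \tag{134}$$

Since $f(P)$ is non-negative, we have

$$\int_{\mathcal{E}_m' \mathcal{E}_n} f(P)\, G(d\mathcal{E}) \le a. \tag{135}$$

We consider the sets $\mathcal{E}_m' \mathcal{E}_n$. They increase as $n$ increases, and since
$\mathcal{E}$ is the limiting set for $\mathcal{E}_n$, the limiting set for $\mathcal{E}_m' \mathcal{E}_n$ will be
$\mathcal{E}_m'$, whence it follows that

$$\lim_{n \to \infty} G(\mathcal{E}_m' - \mathcal{E}_m' \mathcal{E}_n) = 0. \tag{136}$$

Since $b$ is finite, $f(P)$ is summable on $\mathcal{E}_m'$, and in view of (136)
and the absolute continuity of the integral of $f(P)$, we have

$$\lim_{n \to \infty} \int_{\mathcal{E}_m' \mathcal{E}_n} f(P)\, G(d\mathcal{E}) = \int_{\mathcal{E}_m'} f(P)\, G(d\mathcal{E}),$$

which contradicts inequalities (134) and (135). If $b = +\infty$ we take
$[f(P)]_N$ instead of $f(P)$, with $N$ and $m$ so large that

$$\int_{\mathcal{E}_m'} [f(P)]_N\, G(d\mathcal{E}) > a + 1.$$

By (133), we also have

$$\int_{\mathcal{E}_m' \mathcal{E}_n} [f(P)]_N\, G(d\mathcal{E}) \le a.$$

The previous argument leads us to a contradiction, and the theorem
is proved.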
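The independence of the limit from the choice of the increasing sets can be illustrated on the real line. The sketch below assumes the Lebesgue integral and the hypothetical function $f(x) = 1/(1 + x^2)$, whose integral over an interval is known in closed form, and compares two different increasing sequences of intervals; the function names are our own.

```python
import math

def integral_of_f(lo, hi):
    """Exact integral of f(x) = 1 / (1 + x^2) over the interval [lo, hi]."""
    return math.atan(hi) - math.atan(lo)

def symmetric(n):
    """Integral (130) over the set E_n = [-n, n]."""
    return integral_of_f(-n, n)

def lopsided(n):
    """Integral (130) over the set E'_n = [-n^2, sqrt(n)]."""
    return integral_of_f(-n ** 2, math.sqrt(n))

# Both sequences of sets increase and have the whole line as limiting set.
values = [(n, symmetric(n), lopsided(n)) for n in (1, 10, 100, 10000)]
```

Both sequences of integrals are non-decreasing and approach the same limit $\pi$, although at different rates.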

If the integral of a non-negative function $f(P)$ over $\mathcal{E}$ has a finite
value, we say that $f(P)$ is summable on $\mathcal{E}$. It follows from this and
the above definition that, if $f(P)$ is summable, and the non-negative
function $\varphi(P)$ satisfies $\varphi(P) \le f(P)$ on $\mathcal{E}$, $\varphi(P)$ must also be summable.
We now take a function measurable on $\mathcal{E}$ that can vary in sign, and
split it into positive and negative parts:

$$f(P) = f^+(P) - f^-(P). \tag{137}$$

The function $f(P)$ is said to be summable on $\mathcal{E}$ if $f^+(P)$ and $f^-(P)$
are summable. The value of the integral is now given by

$$\int_{\mathcal{E}} f(P)\, G(d\mathcal{E}) = \int_{\mathcal{E}} f^+(P)\, G(d\mathcal{E}) - \int_{\mathcal{E}} f^-(P)\, G(d\mathcal{E}). \tag{138}$$

If only one of the functions $f^+(P)$ and $f^-(P)$ is summable, the
integral of $f(P)$ still has a meaning, as in [52], but its value will be
$(+\infty)$ or $(-\infty)$. For the most part, the set $\mathcal{E}$ on which the integration
is performed is the whole of the plane or the whole of a straight line
or in general the whole of $n$-dimensional space.

The theorem of [52] and properties 1, 2, 3, 4, 5, 6, 7, 9 and 10 
hold for integrals on a measurable set of infinite measure. We shall 
only prove the complete additivity and absolute continuity. The 
proof of the theorem and of the remaining properties is extremely 
simple. As a preliminary we must prove a simple lemma. 

LEMMA. If the non-negative numbers $a_k^{(s)}$ do not decrease as $s$ increases
and $\lim_{s \to \infty} a_k^{(s)} = a_k$, on writing

$$a^{(s)} = \sum_{k=1}^{\infty} a_k^{(s)},$$

we have

$$\lim_{s \to \infty} a^{(s)} = \sum_{k=1}^{\infty} a_k. \tag{139}$$

We use reductio ad absurdum. The sums written may
have the value $(+\infty)$. Writing $a$ for the limit of $a^{(s)}$, we suppose
first that

$$a > \sum_{k=1}^{\infty} a_k.$$

For sufficiently large $s$, we have $a^{(s)} > c$, where $c$ is the sum of
series (139), and, having fixed such an $s$, we can fix so large an $m$
that

$$\sum_{k=1}^{m} a_k^{(s)} > \sum_{k=1}^{\infty} a_k,$$

so that all the more:

$$\sum_{k=1}^{m} a_k > \sum_{k=1}^{\infty} a_k,$$

which contradicts $a_k \ge 0$. Now suppose that

$$a < \sum_{k=1}^{\infty} a_k.$$

We therefore have, for some fixed $m$:

$$\sum_{k=1}^{m} a_k > a.$$

We can now choose so large an $s$ that

$$\sum_{k=1}^{m} a_k^{(s)} > a.$$

This finite sum is obviously $\le a^{(s)}$, so that $a^{(s)} > a$, which is
absurd, since the sequence $a^{(s)}$ tends to $a$ without decreasing. The
lemma is proved.

We now show that the integral is completely additive. Let $f(P)$
be summable on $\mathcal{E}$ and let us divide this set into a finite or denumerable
number of measurable sets $\mathcal{E}_k$ of finite or infinite measure. Now,
$f(P)$ will be summable on each $\mathcal{E}_k$. Suppose further that $\mathcal{E}^{(1)} \subset \mathcal{E}^{(2)} \subset \ldots$
is an increasing sequence of sets of finite measure tending
to $\mathcal{E}$. We introduce the sets $\mathcal{E}_k^{(s)} = \mathcal{E}_k \mathcal{E}^{(s)}$ of finite measure. They
increase as $s$ increases, $\lim_{s \to \infty} \mathcal{E}_k^{(s)} = \mathcal{E}_k$ and
$\mathcal{E}^{(s)} = \mathcal{E}_1^{(s)} + \mathcal{E}_2^{(s)} + \mathcal{E}_3^{(s)} + \ldots$,
the sets on the right-hand side having no points in common
with each other. We have for the sets $\mathcal{E}^{(s)}$ of finite measure:

$$\int_{\mathcal{E}^{(s)}} f(P)\, G(d\mathcal{E}) = \sum_{k=1}^{\infty} \int_{\mathcal{E}_k^{(s)}} f(P)\, G(d\mathcal{E}).$$

On assuming $f(P)$ positive for the moment, passing to the limit
in this formula as $s \to \infty$, and using the lemma, we obtain (20)
of [49]. Our assertion holds in the general case on the basis of (137)
and the fact that it holds separately for $f^+(P)$ and $f^-(P)$.

Property 6 of [49] is proved similarly. Let us show that the integral
is absolutely continuous. We take $f(P) \ge 0$ and summable on $\mathcal{E}$.
Given $\varepsilon > 0$, we choose $m$ so large that

$$\int_{\mathcal{E} - \mathcal{E}^{(m)}} f\, G(d\mathcal{E}) < \frac{\varepsilon}{2}. \tag{140}$$

We can write for any set $e$ contained in $\mathcal{E}$:

$$\int_{e} f\, G(d\mathcal{E}) = \int_{\mathcal{E}^{(m)} e} f\, G(d\mathcal{E}) + \int_{(\mathcal{E} - \mathcal{E}^{(m)}) e} f\, G(d\mathcal{E}).$$

In view of the absolute continuity of the integral on the set $\mathcal{E}^{(m)}$
of finite measure, there exists an $\eta > 0$ such that the absolute value
of the integral over $\mathcal{E}^{(m)} e$ is not greater than $\varepsilon/2$ for $e \subset \mathcal{E}$ and
$G(e) < \eta$. On taking (140) into account, we can assert that the same
is true for the integral over $(\mathcal{E} - \mathcal{E}^{(m)})e$, whence it follows that the
absolute value of the integral over $e$ is not greater than $\varepsilon$ if $e \subset \mathcal{E}$
and $G(e) < \eta$, which proves that the integral is absolutely continuous.
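Theorems 1, 2, 3, 4 of [54] are also easily extended to the case
of a set $\mathcal{E}$ of infinite measure. We shall prove Theorem 1 as an example.
Let $\varepsilon$ be a given positive number. We choose so large an $m$ that

$$\int_{\mathcal{E} - \mathcal{E}^{(m)}} F(P)\, G(d\mathcal{E}) < \varepsilon. \tag{141}$$

We now write an inequality for the integral of $f(P) - f_n(P)$:

$$\left|\int_{\mathcal{E}} (f - f_n)\, G(d\mathcal{E})\right| \le \left|\int_{\mathcal{E}^{(m)}} (f - f_n)\, G(d\mathcal{E})\right| + \left|\int_{\mathcal{E} - \mathcal{E}^{(m)}} (f - f_n)\, G(d\mathcal{E})\right|. \tag{142}$$

We use the inequality $|f - f_n| \le 2F$ on the set $\mathcal{E} - \mathcal{E}^{(m)}$, and
obtain, by (141),

$$\left|\int_{\mathcal{E} - \mathcal{E}^{(m)}} (f - f_n)\, G(d\mathcal{E})\right| \le \int_{\mathcal{E} - \mathcal{E}^{(m)}} |f - f_n|\, G(d\mathcal{E}) \le \int_{\mathcal{E} - \mathcal{E}^{(m)}} 2F\, G(d\mathcal{E}) < 2\varepsilon.$$

Theorem 1 is already proved for the sets $\mathcal{E}^{(m)}$ of finite measure,
so that an $N$ exists such that, with $n > N$, the first term on the
right-hand side of (142) is $< \varepsilon$. We thus have

$$\left|\int_{\mathcal{E}} (f - f_n)\, G(d\mathcal{E})\right| < 3\varepsilon \quad \text{for } n > N,$$

which proves the theorem, inasmuch as $\varepsilon$ is arbitrary. The remaining
theorems of [54] are proved similarly.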


64. The class $L_2$ on a set of infinite measure. The formation of
the class $L_2$ and the theory of orthogonal functions may be carried
over easily to the case of a set $\mathcal{E}$ of infinite measure. We say that a
function $f(P)$ on a set $\mathcal{E}$ of infinite measure belongs to $L_2$ if it is
measurable on $\mathcal{E}$ and its square $f^2(P)$ or the square of its modulus
$|f(P)|^2$ is summable on $\mathcal{E}$. All the theorems of [55] remain in force,
except for Theorem 1. The finite measure of $\mathcal{E}$ was an essential factor
in Theorem 1. An example can easily be given of a function of $L_2$
which is not summable. For instance, $1/x$ belongs to $L_2$ in the interval
$[1, \infty)$, since $1/x^2$ is summable, but $1/x$ is itself not summable. We also
used the finite measure of $\mathcal{E}$ in the proof of Theorem 8. Let us show
that the theorem remains true when the measure of $\mathcal{E}$ is infinite.




For clarity, let $\mathcal{E}$ be the complete plane $\mathcal{E}_\infty$. Let the sequence $f_n(P)$
$(n = 1, 2, \ldots)$ of functions belonging to $L_2$ on $\mathcal{E}_\infty$ be mutually
convergent. Let $\Delta_m$ be the interval defined by $-m \le x \le m$, $-m \le y \le m$.
The functions $f_n(P)$ belong to $L_2$ and are mutually convergent
on each $\Delta_m$, since the integral of a non-negative function over $\Delta_m$
is not greater than the integral of the same function over the whole
plane. It follows from Theorem 8 [56] that a subsequence $f_1^{(1)}(P)$,
$f_2^{(1)}(P), \ldots$ can be extracted from the sequence $f_n(P)$ such that it is
convergent almost everywhere on $\Delta_1$. We can extract from this
subsequence a new subsequence $f_1^{(2)}(P), f_2^{(2)}(P), \ldots$, which is convergent almost
everywhere on $\Delta_2$, and so on. It is readily seen that the subsequence
$f_1^{(1)}(P), f_2^{(2)}(P), \ldots$ is convergent almost everywhere on $\mathcal{E}_\infty$ [IV; 15].
Let $f(P)$ be the limit function for this subsequence. Since the sequence
$f_n(P)$ is mutually convergent on $\mathcal{E}_\infty$, for any given $\varepsilon > 0$ there exists
an $N$ such that

$$\int_{\mathcal{E}_\infty} [f_k^{(k)}(P) - f_n(P)]^2\, G(d\mathcal{E}) < \varepsilon \quad \text{for } k \text{ and } n > N.$$

If $k$ tends to infinity, we obtain, as in [56]:

$$\int_{\mathcal{E}_\infty} [f(P) - f_n(P)]^2\, G(d\mathcal{E}) \le \varepsilon \quad \text{for } n > N,$$

which proves Theorem 8.
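If $f(P) \in L_2$ on $\mathcal{E}_\infty$ and we are given $\varepsilon_0 > 0$, there exists an $N$
such that

$$0 \le \int_{\mathcal{E}_\infty - \Delta_N} f^2(P)\, G(d\mathcal{E}) < \varepsilon_0.$$

We define the function $\psi(P)$ as follows: $\psi(P) = f(P)$ in $\Delta_N$ and
$\psi(P) = 0$ outside $\Delta_N$. Obviously $\psi(P) \in L_2(\mathcal{E}_\infty)$ and

$$\|f - \psi\|_{\mathcal{E}_\infty}^2 = \int_{\mathcal{E}_\infty} [f(P) - \psi(P)]^2\, G(d\mathcal{E}) = \int_{\mathcal{E}_\infty - \Delta_N} f^2(P)\, G(d\mathcal{E}) < \varepsilon_0,$$

whence it is clear that the lineal of functions $\psi(P)$, differing from
zero only on some finite interval, is everywhere dense in $L_2(\mathcal{E}_\infty)$, i.e.
in $L_2$ on $\mathcal{E}_\infty$.

Let us show that $L_2(\mathcal{E}_\infty)$ is separable. As shown above, there
exists on $\Delta_m$ $(m = 1, 2, \ldots)$ a set of functions $\varphi_{k,m}(P)$ $(k, m = 1, 2, \ldots)$
everywhere dense in $L_2$ on $\Delta_m$. On putting these functions equal to zero
outside $\Delta_m$, we obtain a denumerable set of
functions $\varphi_{k,m}(P)$ of $L_2(\mathcal{E}_\infty)$. It is easily seen that they are everywhere
dense in $L_2(\mathcal{E}_\infty)$. For, let $f(P) \in L_2(\mathcal{E}_\infty)$; given an $\varepsilon > 0$, there exists
an $m_0$ such that

$$\int_{\mathcal{E}_\infty - \Delta_{m_0}} f^2(P)\, G(d\mathcal{E}) < \frac{\varepsilon}{2}$$

and, by what has been said, a function $\varphi_{k,m_0}(P)$ can be chosen from
the above-mentioned denumerable set such that

$$\int_{\Delta_{m_0}} [f(P) - \varphi_{k,m_0}(P)]^2\, G(d\mathcal{E}) < \frac{\varepsilon}{2}$$

and hence, since $\varphi_{k,m_0}(P) = 0$ outside $\Delta_{m_0}$,

$$\int_{\mathcal{E}_\infty} [f(P) - \varphi_{k,m_0}(P)]^2\, G(d\mathcal{E}) < \varepsilon,$$

which proves that $L_2(\mathcal{E}_\infty)$ is separable.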

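We now show that the lineal of continuous functions $\varphi(P)$, vanishing
outside some finite interval (different for different $\varphi(P)$), is everywhere
dense in $L_2(\mathcal{E}_\infty)$.

This lineal is usually called the lineal of finite continuous functions.

Let $f(P) \in L_2(\mathcal{E}_\infty)$; given $\varepsilon > 0$, we show that there exists a finite
continuous function $\varphi(P)$ such that $\|f - \varphi\|_{\mathcal{E}_\infty} < \varepsilon$. As we have seen
above, there exists an $m$ such that $\|f\|_{\mathcal{E}_\infty - \Delta_m} < \varepsilon/2$. Having fixed
this $\Delta_m$, we can say that there exists a function $\varphi(P)$, continuous
in $\Delta_m$, such that $\|f - \varphi\|_{\Delta_m} < \varepsilon/2$. On writing $M = \max_{P \in \Delta_m} |\varphi(P)|$,
and given any $h > 0$, we put $\varphi(P) = 0$ on the boundary of $\Delta_{m+h}$
and continue $\varphi(P)$ onto the whole of $\Delta_{m+h}$ whilst retaining the continuity
and without exceeding $\max |\varphi(P)|$ [IV; 157]. Outside $\Delta_{m+h}$ we put
$\varphi(P) = 0$, so that $\varphi(P)$ is a finite continuous function. On taking
into account what has been said, we obtain

$$\|f - \varphi\|_{\mathcal{E}_\infty}^2 = \|f - \varphi\|_{\Delta_m}^2 + \|f - \varphi\|_{\mathcal{E}_\infty - \Delta_m}^2 \le \|f - \varphi\|_{\Delta_m}^2 + 2\|f\|_{\mathcal{E}_\infty - \Delta_m}^2 + 2\|\varphi\|_{\mathcal{E}_\infty - \Delta_m}^2 < \frac{\varepsilon^2}{4} + \frac{\varepsilon^2}{2} + 2\|\varphi\|_{\mathcal{E}_\infty - \Delta_m}^2.$$

But $\|\varphi\|_{\mathcal{E}_\infty - \Delta_m}^2$ is equal to the integral of $|\varphi(P)|^2$ over $\Delta_{m+h} - \Delta_m$.
We have for the Lebesgue integral, since the area of $\Delta_{m+h} - \Delta_m$ is $4(h^2 + 2mh)$:

$$\|\varphi\|_{\mathcal{E}_\infty - \Delta_m}^2 = \iint_{\Delta_{m+h} - \Delta_m} |\varphi(P)|^2\, dx\, dy \le 4M^2 (h^2 + 2mh),$$

and we choose $h$ such that $4M^2(h^2 + 2mh) < \varepsilon^2/8$, after which we get
$\|f - \varphi\|_{\mathcal{E}_\infty} < \varepsilon$, which is what we had to prove. A similar statement
holds for the Lebesgue-Stieltjes integral.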

Theorem 2 of [60] is proved as above. Notice that, for the Lebesgue
integral, polynomials do not belong to $L_2(\mathcal{E}_\infty)$. If
$\mathcal{E}$ is any measurable set, on prolonging a function of
$L_2(\mathcal{E})$ on to $\mathcal{E}_\infty$ by zero, we get a function
of $L_2(\mathcal{E}_\infty)$. On starting out from this, we can extend
everything said above to the case of any unbounded measurable set.

We can quote as an example of a closed orthogonal system on
the interval $(-\infty, +\infty)$ the Hermite functions [III$_2$; 156]:

$$\varphi_k(x) = (-1)^k e^{\frac{1}{2} x^2} \frac{d^k}{dx^k} \left( e^{-x^2} \right)$$

and on the interval $(0, \infty)$ the Laguerre functions [III$_2$; 160]:

$$\varphi_k(x) = e^{\frac{1}{2} x} \frac{d^k}{dx^k} \left( x^k e^{-x} \right).$$

Both examples refer to the Lebesgue integral.

A simple proof of the fact that these systems are closed may be
found in Volume 1 of Hilbert-Courant: Methoden der Mathematischen
Physik (Interscience, N. Y., 1931, 1937).
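
The orthogonality of the Hermite functions can be checked numerically.
The following is only a sketch, under assumptions that are ours and not
the text's: we use the identity
$(-1)^k e^{x^2/2} \frac{d^k}{dx^k} e^{-x^2} = H_k(x) e^{-x^2/2}$, where
$H_k$ are the Hermite polynomials generated by the recurrence
$H_{k+1} = 2x H_k - 2k H_{k-1}$, and we replace the integral over the
whole line by a trapezoidal sum over $[-10, 10]$. The names
`hermite_function` and `inner_product` are illustrative. The check says
nothing about the system being closed.

```python
import math

def hermite_function(k, x):
    """Value of (-1)^k e^{x^2/2} d^k/dx^k e^{-x^2}, computed as H_k(x) e^{-x^2/2}."""
    h_prev, h = 0.0, 1.0  # placeholder for H_{-1} (never used) and H_0 = 1
    for j in range(k):
        # three-term recurrence H_{j+1}(x) = 2x H_j(x) - 2j H_{j-1}(x)
        h_prev, h = h, 2.0 * x * h - 2.0 * j * h_prev
    return h * math.exp(-0.5 * x * x)

def inner_product(j, k, half_width=10.0, steps=4000):
    """Trapezoidal approximation of the integral of phi_j * phi_k over the line.

    Assumption: the integrand is negligible outside [-half_width, half_width].
    """
    dx = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * dx
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * hermite_function(j, x) * hermite_function(k, x)
    return total * dx
```

For distinct indices the computed inner product is zero to rounding
accuracy, while `inner_product(k, k)` approximates
$\sqrt{\pi}\, 2^k k!$.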

What has been said about $L_2(\mathcal{E}_\infty)$ can be carried over at
once to $L_p(\mathcal{E}_\infty)$ ($p > 1$), as was the case for a
bounded set, and to the case of complex functions.


65. An integrating function of bounded variation. We have so far
assumed, when investigating the Lebesgue-Stieltjes integral, that the
function $G(\mathcal{E})$ is non-negative. We turn next to the case when
the integrating function $G(\mathcal{E})$ is obtained from the function
of an interval $G(\Delta)$, which is a function of bounded variation. We
have the canonical form for such a function, as the difference between
two non-negative functions:

$$G(\Delta) = G_1(\Delta) - G_2(\Delta),$$

where

$$G_1(\Delta) = \tfrac{1}{2} [V(\Delta) + G(\Delta)]; \qquad G_2(\Delta) = \tfrac{1}{2} [V(\Delta) - G(\Delta)],$$

and $V(\Delta)$ is the total variation of $G(\Delta)$ on the interval
$\Delta$. Each of the functions $G_1(\Delta)$ and $G_2(\Delta)$ leads to
a non-negative, additive and normal function $G_1(\mathcal{E})$ and
$G_2(\mathcal{E})$ on the closed fields of sets $L_{G_1}$ and $L_{G_2}$.
Let us write $L_G$ for the closed field of sets forming the common part
of $L_{G_1}$ and $L_{G_2}$. The completely additive and normal function

$$G(\mathcal{E}) = G_1(\mathcal{E}) - G_2(\mathcal{E})$$

is defined on this field.




We take the non-negative, additive and normal interval function

$$V(\Delta) = G_1(\Delta) + G_2(\Delta).$$

Its extension leads to a function $V(\mathcal{E})$, defined on the closed
field $L_V$. On using the last formula and the fact that $G_i(\Delta)$ is
non-negative, it is easily shown that $L_V$ is the common part of
$L_{G_1}$ and $L_{G_2}$, i.e. $L_V$ coincides with $L_G$. We first have
to show that, given any set $\mathcal{E}$, its exterior measure with
respect to the function $V(\Delta)$, i.e. $|\mathcal{E}|_V$, is equal to
the sum of its exterior measures with respect to $G_1(\Delta)$ and
$G_2(\Delta)$, i.e.
$|\mathcal{E}|_V = |\mathcal{E}|_{G_1} + |\mathcal{E}|_{G_2}$. It is then
easily shown, by using the definition of measurability, that, if
$\mathcal{E}$ is measurable with respect to $V(\Delta)$, $\mathcal{E}$ is
measurable with respect to $G_1(\Delta)$ and $G_2(\Delta)$; and also,
conversely, if $\mathcal{E}$ is measurable with respect to $G_1(\Delta)$
and $G_2(\Delta)$, it is measurable with respect to $V(\Delta)$. When
integrating, we have to consider the class of functions $f(P)$,
measurable with respect to $V(\Delta)$, i.e. the class of functions
measurable with respect to $G_1(\Delta)$ and $G_2(\Delta)$. The integral
is naturally defined by

$$\int_{\mathcal{E}} f(P)\, G(d\mathcal{E}) = \int_{\mathcal{E}} f(P)\, G_1(d\mathcal{E}) - \int_{\mathcal{E}} f(P)\, G_2(d\mathcal{E}),$$


and its existence is guaranteed by the existence of the integrals on 
the right, on the assumption that both these integrals have finite 
values. Otherwise, the right-hand side may reduce to an indeterminate 
expression. Two functions are said to be equivalent if they are
equivalent with respect to $V(\mathcal{E})$. The properties of the
integral 1, 4, 5, 7, 8, 9 of [52] are retained without change. In
property 3, instead of inequality (16) we have

$$\left| \int_{\mathcal{E}} f(P)\, G(d\mathcal{E}) \right| \le \int_{\mathcal{E}} |f(P)|\, V(d\mathcal{E}).$$

In property 6, instead of the convergence of series (49), we must
require the convergence of

$$\sum_{k=1}^{\infty} \int_{\mathcal{E}_k} |f(P)|\, V(d\mathcal{E}),$$

and finally, in property 10, instead of inequality (50) we have 
| Pema < ff (P)V (da2). 
ë é 




By the definition of the integral, summable functions are functions
summable with respect to $V(\mathcal{E})$. Theorems 1 and 2 of [54] about
passage to the limit are retained without change.

The concept of integral is also easily extended to the case when
$G(\Delta)$ is a complex function:

$$G(\Delta) = G'(\Delta) + G''(\Delta)\, i,$$

where $G'(\Delta)$ and $G''(\Delta)$ are functions of bounded variation.
On using the canonical forms of these functions:

$$G'(\Delta) = G'_1(\Delta) - G'_2(\Delta); \qquad G''(\Delta) = G''_1(\Delta) - G''_2(\Delta),$$

we arrive at the formula

$$G(\Delta) = (G'_1(\Delta) - G'_2(\Delta)) + (G''_1(\Delta) - G''_2(\Delta))\, i.$$

The function $G(\Delta)$ leads to the function $G(\mathcal{E})$, defined
on the closed field $L_G$, which is the common part of the closed fields
$L_{G'_i}$ and $L_{G''_i}$ ($i = 1, 2$). The definition of measurable
functions with respect to $G(\mathcal{E})$ and of the integral are
essentially the same as above; an integrable function may be complex.

In the case of one variable, we have the canonical form for a function
of bounded variation: $g(x) = g_1(x) - g_2(x)$, where the last two
functions are non-decreasing; the integral is written as

$$\int_{\mathcal{E}} f(x)\, dg(x) = \int_{\mathcal{E}} f(x)\, dg_1(x) - \int_{\mathcal{E}} f(x)\, dg_2(x).$$

If we introduce the total variation $v(x) = g_1(x) + g_2(x)$, we have
the inequality

$$\left| \int_{\mathcal{E}} f(x)\, dg(x) \right| \le \int_{\mathcal{E}} |f(x)|\, dv(x)$$

and the summability of $f(x)$ with respect to $g_1(x)$ and $g_2(x)$ is
equivalent to the summability of $f(x)$ with respect to $v(x)$.
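
The canonical decomposition and the inequality with the total variation
can be illustrated on a purely discrete analogue. The sketch below is
our own and rests on an assumption not made in the text: the integrating
function consists of finitely many signed point masses, so that the
variation of each atom is its absolute value. The names `jordan_parts`
and `integrate` and the sample data are hypothetical.

```python
from fractions import Fraction

def jordan_parts(weights):
    """Split signed point masses g into G1 = (|g| + g)/2 and G2 = (|g| - g)/2."""
    g1 = [(abs(w) + w) / 2 for w in weights]
    g2 = [(abs(w) - w) / 2 for w in weights]
    return g1, g2

def integrate(values, masses):
    """Integral of a function, given by its values at the atoms, against point masses."""
    return sum(f * m for f, m in zip(values, masses))

# Hypothetical data: four atoms with signed masses and the values of f there.
weights = [Fraction(3), Fraction(-2), Fraction(1, 2), Fraction(-5, 4)]
values = [Fraction(1), Fraction(4), Fraction(-6), Fraction(2)]

g1, g2 = jordan_parts(weights)
variation = [a + b for a, b in zip(g1, g2)]           # V = G1 + G2
signed_integral = integrate(values, g1) - integrate(values, g2)
bound = integrate([abs(f) for f in values], variation)
```

Here `signed_integral` agrees with the direct sum of $f \cdot g$ over the
atoms, and its absolute value does not exceed `bound`, in line with the
inequality stated for property 3.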


66. The reduction of multiple integrals. We turn now to a discussion
of the basic result of the theory of Lebesgue multiple integrals,
concerning the reduction of a multiple integral to a sequence of
simple quadratures. Let us recall the corresponding results of the
earlier theory of multiple integrals [II; 97]. If, e.g. the function
$f(x, y)$ is continuous on the finite closed interval $\Delta$
($a \le x \le b$; $c \le y \le d$), the following formula holds for
reducing the double integral to two quadratures:

$$\iint_{\Delta} f(x, y)\, dx\, dy = \int_a^b \left[ \int_c^d f(x, y)\, dy \right] dx = \int_c^d \left[ \int_a^b f(x, y)\, dx \right] dy.$$


We next state the analogous theorem for the Lebesgue integral.
It was first proved by the Italian mathematician Fubini in 1907.
FUBINI’S THEOREM. Let $f(x, y)$ be a summable function on the finite
interval $\Delta$ ($a \le x \le b$; $c \le y \le d$). Then $f(x, y)$ is
summable with respect to $y$ on the interval $[c, d]$ for almost all
values of $x$ of $[a, b]$, the function

$$h(x) = \int_c^d f(x, y)\, dy, \tag{143}$$

defined almost everywhere in $[a, b]$, is summable over this interval,
and we have the equation

$$\iint_{\Delta} f(x, y)\, dx\, dy = \int_a^b \left[ \int_c^d f(x, y)\, dy \right] dx. \tag{144}$$

Similar statements hold when the order of integration is changed.
In this case we have

$$\iint_{\Delta} f(x, y)\, dx\, dy = \int_c^d \left[ \int_a^b f(x, y)\, dx \right] dy. \tag{145}$$


Notice that the integrals in the theorem are understood in the
Lebesgue sense, and the summability of functions must naturally be
also understood in this sense. The assertion of summability obviously
includes the assertion that the function is measurable. It should be
noticed that function (143) is defined almost everywhere, but not
necessarily at every point, on $[a, b]$. A similar remark applies for
the function

$$l(y) = \int_a^b f(x, y)\, dx. \tag{146}$$


In order to clarify the proof, which is fairly difficult, we have 
stated Fubini’s theorem for a particular case. We shall indicate later 
the various more general statements of the theorem. The proof must 
be preceded by several lemmas. 

Lemma 1. If Fubini’s theorem holds for functions
$f_1(x, y), f_2(x, y), \ldots, f_m(x, y)$, summable on the interval
$\Delta$, it holds for any linear combination of these functions:

$$f(x, y) = \sum_{k=1}^{m} c_k f_k(x, y). \tag{147}$$

Each of the $f_k(x, y)$ is summable by hypothesis with respect to $y$
on $[c, d]$, if we exclude from the interval $[a, b]$ of variation of
$x$ some set $A_k$ of measure zero. If we exclude from $[a, b]$ the set
$A = A_1 + A_2 + \ldots + A_m$, which is also of measure zero, for the
remaining values of $x$ function (147) will be summable with respect to
$y$ on $[c, d]$. All the functions

$$h_k(x) = \int_c^d f_k(x, y)\, dy$$

will be defined on $[a, b]$ except for points of set $A$. Further, (144)
holds for the $f_k(x, y)$ by hypothesis. If we use the rule for
integration of a sum and take the constant factor outside the integral,
(144) will be seen to hold for function (147), and the lemma is proved.

Note. If all we are given about the $f_k(x, y)$ is that they are
measurable with respect to $y$ on $[c, d]$ for almost all values of $x$
of $[a, b]$, the same can evidently be said of function (147), since the
sum of measurable functions is also measurable. It is naturally
assumed here that the sum has a meaning [43].

Lemma 2. Let $f_n(x, y)$ be a monotonic sequence of summable functions
on the interval $\Delta$, convergent to $f(x, y)$, summable on $\Delta$.
If Fubini’s theorem holds for each of the $f_n(x, y)$, it holds for the
limit function $f(x, y)$.

We shall assume in the proof that $f_n(x, y)$ is a non-decreasing
sequence. The case of a non-increasing sequence reduces to this case
by replacing $f_n(x, y)$ by $-f_n(x, y)$. By hypothesis, each
$f_n(x, y)$ is measurable and summable with respect to $y$ on $[c, d]$,
if we exclude from the interval $[a, b]$ of variation of $x$ a set $A_n$
of measure zero. If we exclude from $[a, b]$ the set
$A = A_1 + A_2 + \ldots$, which also has measure zero, for the remaining
set of $x$ the limit function $f(x, y)$ will also be measurable with
respect to $y$ on $[c, d]$. By hypothesis, each of the functions

$$h_n(x) = \int_c^d f_n(x, y)\, dy \tag{148}$$

is defined on $[a, b]$, if we exclude the set $A_n$ of $x$ of measure
zero. If we exclude from $[a, b]$ the set $A$, which also has measure
zero, all the functions (148) will be defined for the remaining $x$,
i.e. will be defined on the set $[a, b] - A$, and, by hypothesis, are
summable on $[a, b]$. The sequence $h_n(x)$ is increasing, and we can
define the limit function $h(x) = \lim_{n \to \infty} h_n(x)$,
measurable almost everywhere on $[a, b]$. On recalling that Fubini’s
theorem holds by hypothesis for the $f_n(x, y)$, and that the limit
function $f(x, y)$ is summable on $\Delta$ by hypothesis, we can write

$$\int_a^b h_n(x)\, dx = \iint_{\Delta} f_n(x, y)\, dx\, dy \le \iint_{\Delta} f(x, y)\, dx\, dy.$$

Hence, by Theorem 2 of [54], we can say that $h(x)$ is summable
on $[a, b]$, and we have the formula

$$\int_a^b h(x)\, dx = \lim_{n \to \infty} \int_a^b h_n(x)\, dx = \lim_{n \to \infty} \iint_{\Delta} f_n(x, y)\, dx\, dy.$$

On the other hand, by Theorem 2 of [54], we have

$$\lim_{n \to \infty} \iint_{\Delta} f_n(x, y)\, dx\, dy = \iint_{\Delta} f(x, y)\, dx\, dy,$$

and we can therefore write

$$\int_a^b h(x)\, dx = \iint_{\Delta} f(x, y)\, dx\, dy. \tag{149}$$

We have defined $h(x)$ as follows:

$$h(x) = \lim_{n \to \infty} h_n(x) = \lim_{n \to \infty} \int_c^d f_n(x, y)\, dy.$$

To complete the proof of the lemma, we still have to show that
$f(x, y)$ is summable with respect to $y$ on $[c, d]$ for almost all $x$
of $[a, b]$, and that $h(x)$ can be expressed almost everywhere on
$[a, b]$ by

$$h(x) = \int_c^d f(x, y)\, dy. \tag{150}$$


After proving this, we obtain Fubini’s theorem in full for $f(x, y)$,
by (149). Let $B$ be the set of points of $[a, b]$ at which $h(x)$ is
defined and equal to $(+\infty)$. Since $h(x)$ is summable, $B$ is of
measure zero. If we exclude from $[a, b]$ the set $A + B$ of measure
zero, on the remaining set, i.e. almost everywhere on $[a, b]$, the
increasing sequence $h_n(x)$ tends to $h(x)$, which takes finite values,
i.e. for every $x$ of the set $[a, b] - (A + B)$, the integrals over
$[c, d]$ of the non-decreasing sequence of functions $f_n(x, y)$ of $y$
are bounded by the number $h(x)$. By Theorem 2 of [54], for these values
of $x$, $f(x, y)$ is summable with respect to $y$ on $[c, d]$ and we
have

$$\int_c^d f(x, y)\, dy = \lim_{n \to \infty} \int_c^d f_n(x, y)\, dy,$$

and, by (148), $h(x)$ is given by (150) almost everywhere on $[a, b]$.
The lemma is therefore proved.

Note. It follows immediately from the start of the above proof that,
if we are only given that the $f_n(x, y)$ are measurable with respect
to $y$ on $[c, d]$ for almost all $x$ of $[a, b]$, the limit function
$f(x, y)$ is measurable with respect to $y$ for almost all $x$ of
$[a, b]$.


67. The case of the characteristic function. The aim of the present
section is to prove Fubini’s theorem for the case when the integrand
is the characteristic function of some measurable set $\mathcal{E}$
belonging to the interval $\Delta$, referred to in the theorem. The
integral of $\omega_{\mathcal{E}}(P) = \omega_{\mathcal{E}}(x, y)$
obviously gives the measure $m(\mathcal{E})$ of the set $\mathcal{E}$ as
a set on the plane. Let $\mathcal{E}_{x_0}$ be the set of the points of
$\mathcal{E}$ which have a given abscissa $x_0$, i.e.
$\mathcal{E}_{x_0}$ is the intersection of $\mathcal{E}$ with the
straight line $x = x_0$. The characteristic function of this set is
equal to $\omega_{\mathcal{E}}(x_0, y)$. The measurability of
$\mathcal{E}_{x_0}$ with respect to $y$ is equivalent to the
measurability of $\omega_{\mathcal{E}}(x_0, y)$ on the interval
$[c, d]$, and if this is the case, the linear measure of
$\mathcal{E}_{x_0}$, which we denote by $m'(\mathcal{E}_{x_0})$, is
equal to the integral of $\omega_{\mathcal{E}}(x_0, y)$ over the
interval in question. The summability of $\omega_{\mathcal{E}}(x_0, y)$
is guaranteed by virtue of its being bounded. Fubini’s theorem for the
characteristic function $\omega_{\mathcal{E}}(x, y)$ thus reduces to the
following: $\omega_{\mathcal{E}}(x, y)$ is measurable with respect to
$y$ on the interval $[c, d]$ for almost all $x$ of $[a, b]$, the bounded
function

$$h(x) = m'(\mathcal{E}_x) = \int_c^d \omega_{\mathcal{E}}(x, y)\, dy$$

is measurable in $[a, b]$, and

$$m(\mathcal{E}) = \int_a^b \left[ \int_c^d \omega_{\mathcal{E}}(x, y)\, dy \right] dx. \tag{151}$$

More briefly, $\mathcal{E}_x$ is measurable with respect to $y$ for
almost all $x$, and

$$m(\mathcal{E}) = \int_a^b m'(\mathcal{E}_x)\, dx. \tag{152}$$
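
Formula (152) says that the plane measure is recovered by integrating
the linear measures of the sections. A minimal numerical sketch, for a
set chosen by us purely for illustration (the disc
$x^2 + y^2 \le r^2$, whose sections are intervals of length
$2\sqrt{r^2 - x^2}$), with the outer integral approximated by the
midpoint rule; the function names are hypothetical:

```python
import math

def section_length(x, radius=1.0):
    """Linear measure m'(E_x) of the vertical section of the disc x^2 + y^2 <= radius^2."""
    if abs(x) >= radius:
        return 0.0
    return 2.0 * math.sqrt(radius * radius - x * x)

def area_from_sections(radius=1.0, steps=20000):
    """Midpoint-rule approximation of the integral of m'(E_x) over [-radius, radius]."""
    dx = 2.0 * radius / steps
    total = 0.0
    for i in range(steps):
        x = -radius + (i + 0.5) * dx  # midpoint of the i-th subinterval
        total += section_length(x, radius)
    return total * dx
```

The value returned approximates $\pi r^2$, the plane measure of the disc.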


We shall prove Fubini’s theorem for the characteristic function in 
stages. 

Lemma 3. Fubini’s theorem holds for the characteristic function of
any semi-open interval, open set and a set $G_\delta$ belonging to
$\Delta$.

If $\Delta'$ [$\alpha \le x < \beta$; $\gamma \le y < \delta$] is a
semi-open interval belonging to $\Delta$, $\omega_{\Delta'}(x, y)$ is
measurable with respect to $y$ for any $x$:

$$h(x) = \int_c^d \omega_{\Delta'}(x, y)\, dy = \delta - \gamma, \quad \text{if } \alpha \le x < \beta, \quad \text{and } h(x) = 0 \text{ if } x \text{ is outside } [\alpha, \beta),$$

and the lemma is obvious, since the measure of $\Delta'$ is equal to
$(\beta - \alpha)(\delta - \gamma)$. The open set $\mathcal{E}_0$ is the
sum of a denumerable number of semi-open non-overlapping intervals
$\Delta_k$ and

$$\omega_{\mathcal{E}_0}(x, y) = \sum_{k=1}^{\infty} \omega_{\Delta_k}(x, y).$$

By lemma 1, Fubini’s theorem holds for the finite sum

$$\sum_{k=1}^{m} \omega_{\Delta_k}(x, y).$$

As $m$ increases, these sums form a non-decreasing sequence, which
tends to a bounded, and consequently summable, function
$\omega_{\mathcal{E}_0}(x, y)$, and, by lemma 2, Fubini’s theorem also
holds for $\omega_{\mathcal{E}_0}(x, y)$. Suppose, finally, that
$\mathcal{E}_\delta$ is a set $G_\delta$ belonging to the open interval
$\Delta$. We can write it in the form

$$\mathcal{E}_\delta = \prod_{k=1}^{\infty} O_k, \tag{153}$$

where $O_k$ are open sets belonging to the open interval $\Delta$.
Notice that, if certain $O_k$ did not belong to the open interval
$\Delta$, we should be able to replace $O_k$ by the product of $O_k$
with the open interval $\Delta$. By (153),
$\omega_{\mathcal{E}_\delta}(x, y)$ is the limit of a non-increasing
sequence of characteristic functions $\omega_{\mathcal{E}_m}(x, y)$ of
the open sets

$$\mathcal{E}_m = \prod_{k=1}^{m} O_k,$$

and since Fubini’s theorem is already proved for
$\omega_{\mathcal{E}_m}(x, y)$, it holds for
$\omega_{\mathcal{E}_\delta}(x, y)$, by lemma 2. If certain of the
points of the set $\mathcal{E}_\delta$ of type $G_\delta$ lie on the
contour of $\Delta$, we somewhat widen $\Delta$ so that
$\mathcal{E}_\delta$ lies inside the widened interval $\Delta_1$.
Fubini’s theorem holds for $\omega_{\mathcal{E}_\delta}(x, y)$ on
$\Delta_1$. Hence, in view of the fact that
$\omega_{\mathcal{E}_\delta}(x, y) = 0$ outside $\Delta$, we at once
obtain Fubini’s theorem for $\omega_{\mathcal{E}_\delta}(x, y)$ in the
interval $\Delta$. Notice that, in all the cases discussed in this
lemma, $\omega_{\mathcal{E}}(x, y)$ is measurable with respect to $y$
for all $x$.




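Lemma 4. If $\mathcal{E}$ is a set belonging to $\Delta$ and having
plane measure zero, the linear measure of $\mathcal{E}_x$ is zero for
almost all $x$ of $[a, b]$, and Fubini’s theorem holds for
$\omega_{\mathcal{E}}(x, y)$.

We form the set $\mathcal{E}_\delta$ of type $G_\delta$ belonging to
$\Delta$, covering $\mathcal{E}$ and such that
$m(\mathcal{E}_\delta - \mathcal{E}) = 0$ [40]. We have
$\mathcal{E}_\delta = \mathcal{E} + (\mathcal{E}_\delta - \mathcal{E})$,
and since $m(\mathcal{E}) = 0$ and
$m(\mathcal{E}_\delta - \mathcal{E}) = 0$, we have
$m(\mathcal{E}_\delta) = 0$. Fubini’s theorem holds for
$\mathcal{E}_\delta$, and we can write

$$\int_a^b \left[ \int_c^d \omega_{\mathcal{E}_\delta}(x, y)\, dy \right] dx = 0.$$

The quantity in square brackets is non-negative, and, by property
14 of [49], we have almost everywhere on the interval $a \le x \le b$:

$$\int_c^d \omega_{\mathcal{E}_\delta}(x, y)\, dy = 0.$$

It is clear from this that the linear measure of the set of points
of $\mathcal{E}_\delta$, lying on almost all straight lines parallel to
the $Y$ axis, is zero. Since $\mathcal{E} \subset \mathcal{E}_\delta$,
this holds all the more for the set $\mathcal{E}$, i.e. at almost
all $x$ of $[a, b]$,

$$\int_c^d \omega_{\mathcal{E}}(x, y)\, dy = 0,$$

so that Fubini’s theorem holds for $\omega_{\mathcal{E}}(x, y)$:

$$m(\mathcal{E}) = 0 = \int_a^b \left[ \int_c^d \omega_{\mathcal{E}}(x, y)\, dy \right] dx.$$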


Lemma 5. Fubini’s theorem holds for $\omega_{\mathcal{E}}(x, y)$ of any
measurable set $\mathcal{E}$ belonging to $\Delta$.

We form the set $\mathcal{E}_\delta$ of type $G_\delta$ belonging to
$\Delta$, covering $\mathcal{E}$ and such that
$m(\mathcal{E}_\delta - \mathcal{E}) = 0$. By lemmas 3 and 4, Fubini’s
theorem holds for the characteristic functions of the sets
$\mathcal{E}_\delta$ and $(\mathcal{E}_\delta - \mathcal{E})$. But
$\omega_{\mathcal{E}}(x, y) = \omega_{\mathcal{E}_\delta}(x, y) - \omega_{\mathcal{E}_\delta - \mathcal{E}}(x, y)$,
and by lemma 1, Fubini’s theorem holds for $\omega_{\mathcal{E}}(x, y)$.

Notice that, if the measurable unbounded set $\mathcal{E}$ has finite
measure, the $\mathcal{E}_x$ are measurable for almost all $x$, and
(152) holds. This follows at once by a passage to the limit from the
bounded sets. It is easily shown in the same way that, if $\mathcal{E}$
is simply measurable, $\mathcal{E}_x$ is measurable for almost all $x$.
If, in addition, $m'(\mathcal{E}_x)$ is summable, $\mathcal{E}$ has
finite measure, and (152) holds. Lemma 4 obviously also holds for
unbounded sets.




68. Fubini’s theorem. We still need a further simple lemma for 
the complete proof of Fubini’s theorem. 

Lemma 6. Fubini’s theorem holds for a measurable function $f(x, y)$
that takes a finite number of finite values in $\Delta$.

Let $f(x, y) = c_k$ ($k = 1, 2, \ldots, m$) if the point $(x, y)$
belongs to the set $\mathcal{E}_k$, where
$\Delta = \mathcal{E}_1 + \mathcal{E}_2 + \ldots + \mathcal{E}_m$. We
can write $f(x, y)$ as a linear combination of the characteristic
functions of the sets $\mathcal{E}_k$:

$$f(x, y) = \sum_{k=1}^{m} c_k\, \omega_{\mathcal{E}_k}(x, y),$$

and it follows from lemmas 1 and 5 that Fubini’s theorem holds
for $f(x, y)$.
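
For a linear combination of characteristic functions of rectangles the
equality of the double and the iterated integral can be verified
exactly. The sketch below is ours and makes simplifying assumptions: the
sets are semi-open rectangles $[x_0, x_1) \times [y_0, y_1)$ (which may
overlap, as in lemma 1), and rational arithmetic is used so that the
comparison is exact. All names and the sample data are hypothetical.

```python
from fractions import Fraction as F

# Each term is (c, x0, x1, y0, y1): the value c on the rectangle [x0, x1) x [y0, y1).
def double_integral(terms):
    """Sum of c_k times the plane measure of the k-th rectangle."""
    return sum(c * (x1 - x0) * (y1 - y0) for c, x0, x1, y0, y1 in terms)

def inner_integral(terms, x):
    """h(x): integral with respect to y of the step function at a fixed abscissa x."""
    return sum(c * (y1 - y0) for c, x0, x1, y0, y1 in terms if x0 <= x < x1)

def iterated_integral(terms):
    """Integrate h(x) exactly; h is constant between consecutive x-breakpoints."""
    cuts = sorted({x for _, x0, x1, _, _ in terms for x in (x0, x1)})
    total = F(0)
    for left, right in zip(cuts, cuts[1:]):
        total += inner_integral(terms, left) * (right - left)
    return total

terms = [
    (F(2), F(0), F(1), F(0), F(1)),
    (F(-3), F(1, 2), F(2), F(0), F(1, 2)),
    (F(5), F(1), F(3), F(1, 4), F(1)),
]
```

Both `double_integral(terms)` and `iterated_integral(terms)` give the
same rational number for this data; interchanging the roles of $x$ and
$y$ in the code would give the other order of integration.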

Fubini’s theorem can be proved very simply on the basis of the last
lemma. Let $f(x, y)$ be summable on $\Delta$. We split it into positive
and negative parts: $f(x, y) = f^+(x, y) - f^-(x, y)$. By lemma 1, it is
sufficient to prove the theorem for $f^+$ and $f^-$, i.e. we can assume
in the proof that the summable function $f(x, y)$ is non-negative. As we
know from [46], such a function can be written as the limiting function
of a non-decreasing sequence of measurable non-negative functions
$f_n(x, y)$ with a finite number of values. By lemma 6, Fubini’s theorem
holds for the $f_n(x, y)$, so that, by lemma 2, it also holds for
$f(x, y)$ and the theorem is proved.

Notice that we assume the $f(x, y)$ in the theorem to be summable
on the interval $\Delta$. Given this condition, the quadratures on the
right-hand sides of (144) and (145) have a meaning by virtue of the
theorem, and give the double integral of $f(x, y)$ over $\Delta$. The
converse conclusion, that the double integral exists if the quadratures
on the right-hand sides have a meaning, may be false. Examples may be
quoted in which the iterated integrals on the right-hand sides of (144)
and (145) have a meaning, and the results are equal to each other, yet
$f(x, y)$ is not measurable on $\Delta$, or is measurable but not
summable. But if $f(x, y)$ is non-negative on $\Delta$, the converse
holds, and we have the following theorem.

THEOREM. If f(x, y) is measurable and non-negative on the interval 
A, the existence of the iterated integral on the right-hand side of (144) 
implies that f(x, y) is summable on A, so that Fubini’s theorem holds 
for it. 

Suppose that the iterated integral on the right of (144) has a
meaning, i.e. function (143), summable on $[a, b]$, exists almost
everywhere as regards $x$ on $[a, b]$. We introduce the functions

$$[f(x, y)]_n = \begin{cases} f(x, y), & \text{if } f(x, y) \le n, \\ n, & \text{if } f(x, y) > n. \end{cases}$$

They are bounded and measurable, and form a non-decreasing
sequence which tends to $f(x, y)$. They are obviously summable on
$\Delta$, and Fubini’s theorem holds for them. We can write

$$\iint_{\Delta} [f(x, y)]_n\, dx\, dy = \int_a^b \left[ \int_c^d [f(x, y)]_n\, dy \right] dx,$$

but $[f(x, y)]_n \le f(x, y)$, so that

$$\iint_{\Delta} [f(x, y)]_n\, dx\, dy \le \int_a^b h(x)\, dx,$$

whence it follows [50] that $f(x, y)$ is summable on $\Delta$.
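
The role of the truncated functions $[f(x, y)]_n$ can be seen on an
example of our own choosing: $f(x, y) = 1/(x + y)$ on the unit square,
which is unbounded near the origin. Here
$h(x) = \log \frac{1 + x}{x}$ and the iterated integral equals
$2 \log 2$. The sketch below approximates the double integrals of
$[f]_n$ by the midpoint rule (an assumption of the sketch, not a part of
the proof) and shows that they increase with $n$ while staying below the
iterated integral. The function names are hypothetical.

```python
import math

def truncated(f_value, n):
    """[f]_n: equal to f where f <= n and to n where f > n."""
    return f_value if f_value <= n else n

def truncated_double_integral(n, cells=200):
    """Midpoint-rule value of the double integral of [f]_n, f(x, y) = 1/(x + y), on the unit square."""
    step = 1.0 / cells
    total = 0.0
    for i in range(cells):
        x = (i + 0.5) * step
        for j in range(cells):
            y = (j + 0.5) * step
            total += truncated(1.0 / (x + y), n)
    return total * step * step

levels = [1, 4, 16, 64]
integrals = [truncated_double_integral(n) for n in levels]
iterated_value = 2.0 * math.log(2.0)  # integral over [0, 1] of h(x) = log((1 + x)/x)
```

The list `integrals` is non-decreasing and bounded above by
`iterated_value`, which is the situation used in the proof to conclude
that $f$ is summable.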

COROLLARY 1. If $f(x, y)$ changes sign, but the right-hand side of
(144) exists for $|f(x, y)|$, then $|f(x, y)|$ is summable by virtue of
the theorem, so that $f(x, y)$ is also summable on $\Delta$, and
Fubini’s theorem is applicable to it.

COROLLARY 2. If $f(x, y)$ is measurable on $\Delta$ and summable with
respect to $y$ for almost all $x$, the $h(x)$ defined by (143) is a
measurable function. As usual, we can assume $f(x, y)$ non-negative. We
have Fubini’s theorem for $[f(x, y)]_n$ (which is bounded), and

$$\int_c^d [f(x, y)]_n\, dy$$

is measurable. On letting $n$ tend to infinity, we see that the limit
function $h(x)$ is measurable.

We must note some simple generalizations of the statement of
Fubini’s theorem. If $f(x, y)$ is summable on the measurable bounded
set $\mathcal{E}$, we have

$$\iint_{\mathcal{E}} f(x, y)\, dx\, dy = \int_{B_x} \left[ \int_{\mathcal{E}_x} f(x, y)\, dy \right] dx = \int_{B_y} \left[ \int_{\mathcal{E}_y} f(x, y)\, dx \right] dy, \tag{154}$$

where $\mathcal{E}_x$ is the set of points of $\mathcal{E}$ having a
given abscissa $x$, and $\mathcal{E}_y$ is the analogous set for $y$,
whilst $B_x$ and $B_y$ are the projections of $\mathcal{E}$ on the $X$
and $Y$ axes. The integrals over $\mathcal{E}_x$ and $\mathcal{E}_y$ may
not have a meaning for values of $x$ and $y$ forming a set of measure
zero. To prove (154), it is sufficient to cover $\mathcal{E}$ by a
finite interval $\Delta$ and construct a function $f_0(x, y)$, equal to
$f(x, y)$ at points of $\mathcal{E}$ and to zero at all points of
$\Delta$ not belonging to $\mathcal{E}$. We now show that Fubini’s
theorem can be extended to the case of unbounded sets. Let us take the
entire plane as an example. It is sufficient to consider non-negative
functions. Thus, let $f(x, y)$ be measurable, non-negative and summable
over the plane, i.e. the double integral exists:

$$A = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} f(x, y)\, dx\, dy. \tag{155}$$


The function $f(x, y)$ will be summable on any finite interval
$\Delta_{m,n}$ [$-m \le x \le m$; $-n \le y \le n$]. Fubini’s theorem
holds on this interval, i.e.

$$\iint_{\Delta_{m,n}} f(x, y)\, dx\, dy = \int_{-m}^{+m} \left[ \int_{-n}^{+n} f(x, y)\, dy \right] dx.$$

On the other hand, since $f(x, y)$ is non-negative we have

$$\iint_{\Delta_{m,n}} f(x, y)\, dx\, dy \le A,$$

so that

$$\int_{-m}^{+m} \left[ \int_{-n}^{+n} f(x, y)\, dy \right] dx \le A.$$

If $n$ increases indefinitely and we use Theorem 4 of [54], we get

$$\int_{-m}^{+m} \left[ \int_{-\infty}^{+\infty} f(x, y)\, dy \right] dx \le A.$$

If we now let $m$ increase and use the definition of the integral
over an infinite straight line, we arrive at the inequality

$$\int_{-\infty}^{+\infty} \left[ \int_{-\infty}^{+\infty} f(x, y)\, dy \right] dx \le A. \tag{156}$$

Finally, we show that the $<$ sign cannot hold. If it does, there
must exist a positive $a$ such that

$$\int_{-\infty}^{+\infty} \left[ \int_{-\infty}^{+\infty} f(x, y)\, dy \right] dx < A - a$$

and all the more

$$\iint_{\Delta_{n,n}} f(x, y)\, dx\, dy = \int_{-n}^{+n} \left[ \int_{-n}^{+n} f(x, y)\, dy \right] dx < A - a,$$

which is absurd, since the integral over $\Delta_{n,n}$ must tend to
integral (155) on indefinite increase of $n$. We must therefore have the
$=$ sign in (156), and, on comparing with (155), we in fact obtain for
the whole plane the formula that appears in Fubini’s theorem. It
obviously follows from our proof that the iterated integral appearing in
this formula exists.

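Fubini’s theorem can also be stated for integrals of any multiplicity.
The result is as follows. Let $\Delta_{m+n}$ be an interval in space
$R_{m+n}$, having $(m + n)$ dimensions, defined by the inequalities

$$a_1 \le x_1 \le b_1; \quad a_2 \le x_2 \le b_2; \quad \ldots; \quad a_{m+n} \le x_{m+n} \le b_{m+n},$$

whilst $\Delta_m$ and $\Delta_n$ are intervals in spaces $R_m$ and
$R_n$, defined by

$$\Delta_m: \quad a_1 \le x_1 \le b_1; \quad a_2 \le x_2 \le b_2; \quad \ldots; \quad a_m \le x_m \le b_m;$$

$$\Delta_n: \quad a_{m+1} \le x_{m+1} \le b_{m+1}; \quad \ldots; \quad a_{m+n} \le x_{m+n} \le b_{m+n}.$$

Further, let $f(P)$ be a function summable over $\Delta_{m+n}$. If we
fix a point $P_0(x_1^0, x_2^0, \ldots, x_m^0)$ of $\Delta_m$, $f(P)$
will be a summable function in $\Delta_n$ for any choice of $P_0$,
except possibly for a set of points $P_0$ which has measure zero in
$R_m$. The integral of $f(P)$ over $\Delta_n$:

$$h(x_1, x_2, \ldots, x_m) = \int_{\Delta_n} f(P)\, dx_{m+1} \ldots dx_{m+n}$$

gives a summable function in $\Delta_m$, and the formula holds:

$$\int_{\Delta_{m+n}} f(P)\, dx_1\, dx_2 \ldots dx_{m+n} = \int_{\Delta_m} \left[ \int_{\Delta_n} f(P)\, dx_{m+1}\, dx_{m+2} \ldots dx_{m+n} \right] dx_1\, dx_2 \ldots dx_m. \tag{157}$$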


Fubini’s theorem also admits of simple generalization to the case
of Lebesgue-Stieltjes integrals. Suppose we have two increasing and
bounded functions $g(x)$ and $k(y)$. By using these functions, we can
define measures $G(\Delta)$ and $K(\Delta)$ of semi-open intervals and
then extend the functions in question to the closed fields $L_G$ and
$L_K$. We thus obtain additive, non-negative and normal functions
$G(\mathcal{E})$ and $K(\mathcal{E})$ on $L_G$ and $L_K$. Similarly, by
starting from the function $g(x)\, k(y)$, defined on the plane, we can
form an additive, non-negative and normal function $M(\mathcal{E})$ on
some closed field $L_M$ of sets on the plane. If $f(P) = f(x, y)$ is
measurable with respect to $M(\mathcal{E})$ and summable over some
interval $\Delta$ of the plane, this function is summable with respect
to $y$ over the interval $\Delta_y$ of the $Y$ axis, corresponding to
the interval $\Delta$ of the plane, with respect to the function
$K(\mathcal{E})$:

$$h(x) = \int_{\Delta_y} f(x, y)\, K(d\mathcal{E}) = \int_{\Delta_y} f(x, y)\, dk(y),$$

if we exclude from the interval $\Delta_x$ of the $X$ axis,
corresponding to the interval $\Delta$ of the plane, some set of values
having measure zero with respect to $G(\mathcal{E})$. The function
$h(x)$ is summable on $\Delta_x$ with respect to $G(\mathcal{E})$, and
the formula holds:

$$\iint_{\Delta} f(P)\, M(d\mathcal{E}) = \iint_{\Delta} f(x, y)\, d\,[g(x)\, k(y)] = \int_{\Delta_x} \left[ \int_{\Delta_y} f(x, y)\, dk(y) \right] dg(x).$$

Another similar formula is obtained by changing the order of
integration. The proof of this generalization of Fubini’s theorem
is precisely the same as that of the basic theorem, except that the
Lebesgue integral must be replaced everywhere by Lebesgue-Stieltjes
integrals, and measurability in the Lebesgue sense by measurability
with respect to the functions $G(\mathcal{E})$, $K(\mathcal{E})$ and
$M(\mathcal{E})$.

Note. If we are only given that the function $f(x, y)$ is measurable
on an interval $\Delta$ of the plane, it follows from this that it is
measurable with respect to $y$ on $[c, d]$ for almost all $x$ of
$[a, b]$, and is measurable with respect to $x$ on $[a, b]$ for almost
all $y$ of $[c, d]$. This remark follows at once from the remarks that
we made after the proofs of Lemmas 1 and 2, and from the later proof of
Fubini’s theorem.


69. Change of the order of integration. Another theorem may be mentioned, 
on changing the order of integration. 

THEOREM. Let the function $g(x, t)$ be summable with respect to $t$ over
the interval $[c, d]$ for all $x$ of the interval $[a, b]$, and be of
bounded variation with respect to $x$ on this interval for all $t$ of
$[c, d]$, except possibly for a set of $t$ of Lebesgue measure zero.
Further, let the total variation of $g(x, t)$ with respect to $x$ on
$[a, b]$ for all the $t$ in question not exceed some non-negative
function $F(t)$, measurable on $[c, d]$, for which the integral exists:

$$\int_c^d F(t)\, dt. \tag{158}$$

Then the function

$$\int_c^d g(x, t)\, dt \tag{159}$$

is of bounded variation in $x$ on $[a, b]$, and we have, for any
function $f(x)$, continuous in $[a, b]$:

$$\int_c^d \left[ \int_a^b f(x)\, d_x g(x, t) \right] dt = \int_a^b f(x)\, d_x \left[ \int_c^d g(x, t)\, dt \right], \tag{160}$$

the integrals with respect to $t$ being Lebesgue integrals.


To prove that (159) is a function of bounded variation, we divide
$[a, b]$ by $a = x_0 < x_1 < x_2 < \ldots < x_{n-1} < x_n = b$ and form
the sum $t_n$ [8] for this subdivision. We get

$$t_n = \sum_{k=1}^{n} \left| \int_c^d g(x_k, t)\, dt - \int_c^d g(x_{k-1}, t)\, dt \right| = \sum_{k=1}^{n} \left| \int_c^d [g(x_k, t) - g(x_{k-1}, t)]\, dt \right|,$$

whence

$$t_n \le \int_c^d \sum_{k=1}^{n} |g(x_k, t) - g(x_{k-1}, t)|\, dt.$$

But, by hypothesis,

$$\sum_{k=1}^{n} |g(x_k, t) - g(x_{k-1}, t)| \le F(t),$$

so that

$$t_n \le \int_c^d F(t)\, dt,$$

whence it follows that (159) is a function of bounded variation.
We write the obvious equation:

$$\int_c^d \sum_{k=1}^{n} f(\xi_k) [g(x_k, t) - g(x_{k-1}, t)]\, dt = \sum_{k=1}^{n} f(\xi_k) \left[ \int_c^d g(x_k, t)\, dt - \int_c^d g(x_{k-1}, t)\, dt \right], \tag{161}$$

where $\xi_k$ is a point of $[x_{k-1}, x_k]$. On indefinite subdivision,
the right-hand side of (161) tends to the integral on the right of
(160). Since $f(x)$ is continuous in $[a, b]$, we have $|f(x)| \le L$,
where $L$ is a positive number. We have for the integrand of the
integral on the left-hand side of (161):

$$\left| \sum_{k=1}^{n} f(\xi_k) [g(x_k, t) - g(x_{k-1}, t)] \right| \le L \sum_{k=1}^{n} |g(x_k, t) - g(x_{k-1}, t)| \le L F(t).$$

On applying Theorem 1 of [54], we see that we can pass to the limit on
indefinite subdivision under the integral sign in the integral on the
left-hand side of (161), where the integrand in question gives in the
limit the Stieltjes integral:

$$\int_a^b f(x)\, d_x g(x, t).$$

Finally, passage to the limit in (161) leads to (160). This theorem
admits of some elementary generalizations. For instance, we can assume
$[a, b]$ infinite and $f(x)$ continuous inside this interval and
bounded. The Lebesgue integral with respect to $t$ can be replaced by
the Lebesgue-Stieltjes integral. The original Stieltjes integral with
respect to $g(x, t)$ can be replaced by a general Stieltjes integral and
$f(x)$ can be assumed merely bounded on $[a, b]$. In this case, the
existence of the integral on the right of (160) implies the existence of
the integral on the left-hand side, and the equality of these two
integrals.


70. Continuity in the mean. We return to $L_p$ ($p > 1$) and show that
every function of $L_p$ is continuous, if we take account of the
increment in its norm in $L_p$. We take the case of a bounded measurable
set $\mathcal{E}$. We take the integrals in the Lebesgue sense and
consider the plane case for definiteness.

THEOREM. If $f(x, y) \in L_p$ on a bounded measurable set $\mathcal{E}$,
given any $\varepsilon > 0$, there exists an $\eta > 0$ such that

$$\| f(x + h, y + k) - f(x, y) \|^p_{\mathcal{E}} = \iint_{\mathcal{E}} | f(x + h, y + k) - f(x, y) |^p\, dx\, dy \le \varepsilon^p, \quad \text{if } |h| \text{ and } |k| \le \eta. \tag{162}$$


The point $(x + h, y + k)$ may no longer belong to $\mathcal{E}$, so we
continue $f(x, y)$ outside $\mathcal{E}$ by zero. The norm subscript
indicates the set with respect to which the norm is taken. We can
include $\mathcal{E}$ in a finite closed interval $\Delta_0$
($a_1 \le x \le b_1$; $a_2 \le y \le b_2$). By Theorem 1 of [60],
there exists a $\varphi(x, y)$, continuous in $\Delta_0$, such that
$\| f - \varphi \|_{\Delta_0} \le \varepsilon/4$. We can continue
$\varphi(x, y)$ on to a wider interval whilst preserving its continuity,
say on to $\Delta_1$ ($a_1 - 1 \le x \le b_1 + 1$;
$a_2 - 1 \le y \le b_2 + 1$). We write $f(x + h, y + k) - f(x, y)$ as

$$f(x + h, y + k) - f(x, y) = [f(x + h, y + k) - \varphi(x + h, y + k)] + [\varphi(x + h, y + k) - \varphi(x, y)] + [\varphi(x, y) - f(x, y)].$$

We have

$$\| f(x + h, y + k) - f(x, y) \|_{\Delta_0} \le \| f(x + h, y + k) - \varphi(x + h, y + k) \|_{\Delta_0} + \| \varphi(x + h, y + k) - \varphi(x, y) \|_{\Delta_0} + \| \varphi(x, y) - f(x, y) \|_{\Delta_0}.$$

We had $\| \varphi(x, y) - f(x, y) \|_{\Delta_0} \le \varepsilon/4$. In
view of the uniform continuity of $\varphi(x, y)$ in $\Delta_1$, there
exists an $\eta_1 > 0$ (we assume $\eta_1 \le 1$) such that
$\| \varphi(x + h, y + k) - \varphi(x, y) \|_{\Delta_0} \le \varepsilon/4$
for $|h|$ and $|k| \le \eta_1$, so that

$$\| f(x + h, y + k) - f(x, y) \|_{\Delta_0} \le \frac{\varepsilon}{4} + \frac{\varepsilon}{4} + \| f(x + h, y + k) - \varphi(x + h, y + k) \|_{\Delta_0} \quad \text{for } |h|, |k| \le \eta_1. \tag{163}$$

We have

$$\| f(x + h, y + k) - \varphi(x + h, y + k) \|^p_{\Delta_0} = \iint_{\Delta_0} | f(x + h, y + k) - \varphi(x + h, y + k) |^p\, dx\, dy,$$

or

$$\| f(x + h, y + k) - \varphi(x + h, y + k) \|^p_{\Delta_0} = \iint_{\Delta_0(h, k)} | f(x, y) - \varphi(x, y) |^p\, dx\, dy,$$

where $\Delta_0(h, k)$ is the interval obtained from $\Delta_0$ by
parallel displacement along the vector $(h, k)$. On taking say $h$ and
$k > 0$, we get

$$\| f(x + h, y + k) - \varphi(x + h, y + k) \|^p_{\Delta_0} = \| f(x, y) - \varphi(x, y) \|^p_{\Delta'} + \iint_{\substack{a_1 + h \le x \le b_1 + h \\ b_2 \le y \le b_2 + k}} |\varphi(x, y)|^p\, dx\, dy + \iint_{\substack{b_1 \le x \le b_1 + h \\ a_2 + k \le y \le b_2}} |\varphi(x, y)|^p\, dx\, dy,$$

where $\Delta'$ is part of $\Delta_0$, so that
$\| f(x, y) - \varphi(x, y) \|_{\Delta'} \le \varepsilon/4$. The last
two integrals obviously tend to zero as $h$ and $k \to 0$, so that there
exists an $\eta_2 > 0$ such that

$$\| f(x + h, y + k) - \varphi(x + h, y + k) \|_{\Delta_0} \le \frac{\varepsilon}{2} \quad \text{for } |h| \text{ and } |k| \le \eta,$$

where $\eta = \min(\eta_1, \eta_2)$. We obtain from (163):
$\| f(x + h, y + k) - f(x, y) \|_{\Delta_0} \le \varepsilon$ for $|h|$
and $|k| \le \eta$, and so all the more
$\| f(x + h, y + k) - f(x, y) \|_{\mathcal{E}} \le \varepsilon$ for
$|h|$ and $|k| \le \eta$, which is what we set out to prove.

The theorem can also be proved for the case of an unbounded
measurable set.


71. Mean functions. An averaging process can be introduced for 
any summable function f(P). It leads us to a sequence of functions 
which have derivatives of all orders and tend to f(P) in the usual 
sense. We shall take for definiteness the case of a plane and use the 
coordinates of points instead of the points themselves. Let $\omega_1(P, Q) = \omega_1(x, y; \xi, \eta)$
be a function depending only on the distance

$$r = |PQ| = \sqrt{(x - \xi)^2 + (y - \eta)^2},$$





equal to zero for $r \ge 1$, continuous and having continuous derivatives
of all orders with respect to all four coordinates. Notice here that
differentiation of $\omega_1(P, Q)$ with respect to $x$ and $y$ can be replaced
by differentiation with respect to $\xi$ and $\eta$ when the sign is changed
in the result. Suppose further that

$$\int \omega_1(x, y; \xi, \eta)\,dx\,dy = 1. \tag{164}$$


We do not write the domain of integration, on the assumption 
that the integral is taken over the whole plane. By what has been 
said, the integration is in fact performed over the circle $(x - \xi)^2 + (y - \eta)^2 \le 1$.
If the integration with respect to $(x, y)$ in (164) is
replaced by integration with respect to $(\xi, \eta)$, the result will be
the same. 

We now introduce the following notation: 


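$$\omega_\rho(x, y; \xi, \eta) = \omega_1\!\left(\frac{x}{\rho}, \frac{y}{\rho}; \frac{\xi}{\rho}, \frac{\eta}{\rho}\right) \quad (\rho > 0). \tag{165}$$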


The function $\omega_\rho$ has the same differential properties as $\omega_1$, but
$\omega_\rho = 0$ for $r \ge \rho$, and

$$\int \omega_\rho(x, y; \xi, \eta)\,dx\,dy = \rho^2. \tag{166}$$

We shall indicate later a possible choice for $\omega_1(P, Q)$ [cf. IV; 157].


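We shall in future use $C_\rho(\xi, \eta)$ to denote the circle with centre $(\xi, \eta)$
and radius $\rho$. We further introduce the notation:

$$\int |\omega_1(x, y; \xi, \eta)|\,dx\,dy = C, \tag{167}$$

whence

$$\int |\omega_\rho(x, y; \xi, \eta)|\,dx\,dy = C\rho^2. \tag{168}$$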


Suppose we have the summable function $f(x, y)$ in the bounded
open or closed domain $D_0$ (instead of a domain we could take any
bounded measurable set). We continue it by zero on to the whole of
the plane and form the mean function from it:

$$f_\rho(\xi, \eta) = \frac{1}{\rho^2} \int f(x, y)\,\omega_\rho(x, y; \xi, \eta)\,dx\,dy. \tag{169}$$

The positive number $\rho$ is usually called the averaging radius.

The integrand is zero outside the circle $C_\rho(\xi, \eta)$, and, if the distance
$d$ from the point $(\xi, \eta)$ to $D_0$ is greater than zero, we have $f_\rho(\xi, \eta) = 0$
for $\rho \le d$.
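
The definition (169) can be tried out numerically. The following is only a sketch under stated assumptions: it uses the exponential kernel that the text introduces later in (187), fixes its constant by a midpoint-rule quadrature rather than exactly, and takes for $f$ the indicator of the unit square. The names `bump`, `midpoint_2d`, `NORM` and `mean_function` are illustrative and do not come from the text.

```python
import math

def bump(x, y, xi, eta):
    """Unnormalised kernel: exp(r^2/(r^2-1)) for r < 1, zero for r >= 1."""
    r2 = (x - xi) ** 2 + (y - eta) ** 2
    return math.exp(r2 / (r2 - 1.0)) if r2 < 1.0 else 0.0

def midpoint_2d(g, cx, cy, half, n=120):
    """Midpoint rule for g over the square of half-side `half` centred at (cx, cy)."""
    h = 2.0 * half / n
    total = 0.0
    for i in range(n):
        x = cx - half + (i + 0.5) * h
        for j in range(n):
            y = cy - half + (j + 0.5) * h
            total += g(x, y)
    return total * h * h

# Numerical normalising constant, chosen so that the kernel integrates to 1 as in (164).
NORM = 1.0 / midpoint_2d(lambda x, y: bump(x, y, 0.0, 0.0), 0.0, 0.0, 1.0)

def omega_rho(x, y, xi, eta, rho):
    """Scaled kernel (165): omega_1 with every argument divided by rho."""
    return NORM * bump(x / rho, y / rho, xi / rho, eta / rho)

def mean_function(f, xi, eta, rho, n=120):
    """Mean function (169): (1/rho^2) * integral of f * omega_rho over the circle C_rho."""
    g = lambda x, y: f(x, y) * omega_rho(x, y, xi, eta, rho)
    return midpoint_2d(g, xi, eta, rho, n) / rho ** 2

def indicator_unit_square(x, y):
    """f = 1 on the closed unit square, continued by zero outside it."""
    return 1.0 if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0 else 0.0

inside = mean_function(indicator_unit_square, 0.5, 0.5, 0.25)    # circle lies inside D_0
on_edge = mean_function(indicator_unit_square, 1.0, 0.5, 0.25)   # centre on the boundary
far_away = mean_function(indicator_unit_square, 2.0, 2.0, 0.25)  # distance to D_0 exceeds rho
```

With these choices the mean function is close to 1 where the averaging circle lies inside the square, close to 1/2 at the midpoint of a side (by symmetry), and exactly zero once the distance to the square exceeds the averaging radius, in line with the remark following (169).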

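THEOREM 1. The function $f_\rho(\xi, \eta)$ is continuous and has continuous
partial derivatives of any order throughout the plane.

Since $\omega_\rho$ depends only on the differences $x - \xi$ and $y - \eta$, we have

$$|f_\rho(\xi + h, \eta + k) - f_\rho(\xi, \eta)| \le \frac{1}{\rho^2} \int |f(x, y)|\,|\omega_\rho(x - h, y - k; \xi, \eta) - \omega_\rho(x, y; \xi, \eta)|\,dx\,dy. \tag{170}$$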




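It follows from the uniform continuity of $\omega_\rho$ as a function of $x, y, \xi, \eta$
that, given any $\varepsilon > 0$, there exists an $\eta_0 > 0$ such that

$$|\omega_\rho(x - h, y - k; \xi, \eta) - \omega_\rho(x, y; \xi, \eta)| \le \varepsilon \quad \text{for } |h| \text{ and } |k| \le \eta_0,$$

and consequently,

$$|f_\rho(\xi + h, \eta + k) - f_\rho(\xi, \eta)| \le \frac{\varepsilon}{\rho^2} \int |f(x, y)|\,dx\,dy \quad (|h| \text{ and } |k| \le \eta_0),$$

whence, since $\varepsilon$ is arbitrary, it follows that $f_\rho(\xi, \eta)$ is continuous.
We now prove the existence and continuity of the derivative with
respect to $\xi$: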

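$$\frac{f_\rho(\xi + h, \eta) - f_\rho(\xi, \eta)}{h} = \frac{1}{\rho^2} \int f(x, y)\,\frac{\omega_\rho(x - h, y; \xi, \eta) - \omega_\rho(x, y; \xi, \eta)}{h}\,dx\,dy. \tag{171}$$

By the mean value theorem:

$$\frac{\omega_\rho(x - h, y; \xi, \eta) - \omega_\rho(x, y; \xi, \eta)}{h} = -\frac{\partial \omega_\rho(x - \theta h, y; \xi, \eta)}{\partial x} \quad (0 < \theta < 1).$$

The absolute value of the right-hand side does not exceed some
number $K$ for any $h$, and the integrand in (171) has an absolute value
not exceeding the summable function $K\,|f(x, y)|$, so that we can
pass to the limit under the integral sign, and obtain

$$\frac{\partial f_\rho(\xi, \eta)}{\partial \xi} = -\frac{1}{\rho^2} \int f(x, y)\,\frac{\partial \omega_\rho(x, y; \xi, \eta)}{\partial x}\,dx\,dy = \frac{1}{\rho^2} \int f(x, y)\,\frac{\partial \omega_\rho(x, y; \xi, \eta)}{\partial \xi}\,dx\,dy.$$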


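The continuity of this partial derivative can be proved in precisely
the same way as the continuity of $f_\rho(\xi, \eta)$. The above proof is also
applicable to the further derivatives, and they are obtained by
differentiation under the integral sign:

$$\frac{\partial^k f_\rho(\xi, \eta)}{\partial \xi^p\,\partial \eta^q} = \frac{1}{\rho^2} \int f(x, y)\,\frac{\partial^k \omega_\rho(x, y; \xi, \eta)}{\partial \xi^p\,\partial \eta^q}\,dx\,dy \quad (p + q = k). \tag{172}$$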





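THEOREM 2. If $f(x, y) \in L_2(D_0)$, i.e. $L_2$ on $D_0$, then

$$\int |f_\rho(x, y)|^2\,dx\,dy \le C_1 \int |f(x, y)|^2\,dx\,dy, \tag{173}$$

where $C_1$ is a constant, and

$$\lim_{\rho \to 0} \|f - f_\rho\|^2 = \lim_{\rho \to 0} \int |f(x, y) - f_\rho(x, y)|^2\,dx\,dy = 0. \tag{174}$$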




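We have

$$|f_\rho(\xi, \eta)| \le \frac{1}{\rho^2} \int \sqrt{|\omega_\rho(x, y; \xi, \eta)|}\,\sqrt{|\omega_\rho(x, y; \xi, \eta)|}\,|f(x, y)|\,dx\,dy.$$

We apply Buniakowski's inequality:

$$|f_\rho(\xi, \eta)|^2 \le \frac{1}{\rho^4} \int |\omega_\rho(x, y; \xi, \eta)|\,dx\,dy \cdot \int |f(x, y)|^2\,|\omega_\rho(x, y; \xi, \eta)|\,dx\,dy. \tag{175}$$

On using (168), integrating with respect to $(\xi, \eta)$ and changing
the order of integration [69], we obtain

$$\int |f_\rho(\xi, \eta)|^2\,d\xi\,d\eta \le \frac{C}{\rho^2} \int |f(x, y)|^2 \left[\int |\omega_\rho(x, y; \xi, \eta)|\,d\xi\,d\eta\right] dx\,dy,$$

whence, by (168), we obtain (173) ($C_1 = C^2$), where the integration is
carried out on the right over $D_0$, since $f(x, y) = 0$ outside $D_0$.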

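We turn to the proof of (174). On taking (166) and (169) into
account, we can write

$$\|f - f_\rho\|^2 = \int |f(\xi, \eta) - f_\rho(\xi, \eta)|^2\,d\xi\,d\eta = \frac{1}{\rho^4} \int \left|\int [f(\xi, \eta) - f(x, y)]\,\omega_\rho(x, y; \xi, \eta)\,dx\,dy\right|^2 d\xi\,d\eta. \tag{176}$$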
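We replace $(x, y)$ in the inner integral by new variables of integration
$(u, v)$, in accordance with $x = \xi + u$; $y = \eta + v$, and on observing
that $\omega_\rho(\xi + u, \eta + v; \xi, \eta) = 0$ for $u^2 + v^2 \ge \rho^2$, we obtain by
applying Buniakowski's inequality:

$$\left|\int [f(\xi, \eta) - f(x, y)]\,\omega_\rho(x, y; \xi, \eta)\,dx\,dy\right|^2 \le \int_{u^2 + v^2 \le \rho^2} |f(\xi, \eta) - f(\xi + u, \eta + v)|^2\,du\,dv \times \int_{u^2 + v^2 \le \rho^2} \omega_\rho^2(\xi + u, \eta + v; \xi, \eta)\,du\,dv.$$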


But $|\omega_\rho| \le C_2$ ($C_2$ is a constant, independent of $\rho$), so that the right-hand
side does not exceed

$$C_3 \rho^2 \int_{u^2 + v^2 \le \rho^2} |f(\xi, \eta) - f(\xi + u, \eta + v)|^2\,du\,dv \quad (C_3 = \pi C_2^2).$$


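As a result, (176) gives

$$\|f - f_\rho\|^2 \le \frac{C_3}{\rho^2} \int \left[\int_{u^2 + v^2 \le \rho^2} |f(\xi, \eta) - f(\xi + u, \eta + v)|^2\,du\,dv\right] d\xi\,d\eta. \tag{177}$$

We change the order of integration:

$$\|f - f_\rho\|^2 \le \frac{C_3}{\rho^2} \int_{u^2 + v^2 \le \rho^2} \left[\int |f(\xi, \eta) - f(\xi + u, \eta + v)|^2\,d\xi\,d\eta\right] du\,dv. \tag{178}$$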


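In view of the continuity in the mean, given any $\varepsilon > 0$ there
exists an $\eta_0 > 0$ such that

$$\int |f(\xi, \eta) - f(\xi + u, \eta + v)|^2\,d\xi\,d\eta \le \varepsilon, \quad \text{if } u^2 + v^2 \le \eta_0^2.$$

Inequality (178) now gives

$$\|f - f_\rho\|^2 \le \frac{C_3}{\rho^2} \int_{u^2 + v^2 \le \rho^2} \varepsilon\,du\,dv = C_3 \pi \varepsilon \quad \text{for } \rho \le \eta_0,$$

which leads to (174), since $\varepsilon$ is arbitrary.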
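
The convergence (174) and the bound (173) can be watched numerically. The sketch below is a one-dimensional analogue and not the planar case treated in the text, so the factor $1/\rho^2$ of (169) becomes $1/\rho$; the kernel is the one-variable version of the exponential kernel given later in (187), normalised by quadrature, and $f$ is taken to be the indicator of $[0, 1]$. All function names are illustrative.

```python
import math

def bump(t):
    """exp(t^2/(t^2-1)) for |t| < 1, zero otherwise (one-variable analogue of the kernel)."""
    t2 = t * t
    return math.exp(t2 / (t2 - 1.0)) if t2 < 1.0 else 0.0

def integrate(g, a, b, n):
    """Midpoint rule on [a, b] with n cells."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

# Numerical normalisation so that the kernel integrates to 1 (analogue of (164)).
NORM = 1.0 / integrate(bump, -1.0, 1.0, 2000)

def f(x):
    """Indicator of [0, 1], continued by zero."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def mean_function(xi, rho, n=160):
    """One-dimensional analogue of (169)."""
    g = lambda x: f(x) * NORM * bump((x - xi) / rho)
    return integrate(g, xi - rho, xi + rho, n) / rho

def squared_error(rho, n=300):
    """Approximation to the squared L2 distance between f and its mean function."""
    return integrate(lambda xi: (f(xi) - mean_function(xi, rho)) ** 2, -1.0, 2.0, n)

def squared_norm_of_mean(rho, n=300):
    """Approximation to the squared L2 norm of the mean function."""
    return integrate(lambda xi: mean_function(xi, rho) ** 2, -1.0, 2.0, n)

radii = (0.4, 0.2, 0.1)
errors = [squared_error(rho) for rho in radii]
norms = [squared_norm_of_mean(rho) for rho in radii]
```

For this discontinuous $f$ the squared error shrinks as the averaging radius is halved, and the squared norm of the mean function stays below that of $f$ (which equals 1), as (173) predicts with $C_1 = 1$ for a non-negative kernel.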

We shall prove a further theorem, which will be needed later. 

THEOREM 3. Let $U$ be a set of functions $f(x, y)$ of $L_2(D_0)$, bounded
in norm by the same number. Given a fixed $\rho$, the corresponding functions
$f_\rho(x, y)$ are now bounded in modulus by the same number and are
equi-continuous.

By hypothesis, there exists a positive number $m$ such that, for all
functions $f(x, y)$ of $U$:

$$\int_{D_0} |f(x, y)|^2\,dx\,dy \le m. \tag{179}$$
D, 


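On applying Buniakowski's inequality to the right-hand side of
(169), we get

$$|f_\rho(\xi, \eta)|^2 \le \frac{1}{\rho^4} \int \omega_\rho^2(x, y; \xi, \eta)\,dx\,dy \cdot \int_{D_0} |f(x, y)|^2\,dx\,dy \le \frac{m}{\rho^4} \int \omega_\rho^2(x, y; \xi, \eta)\,dx\,dy,$$

and the right-hand side depends neither on the choice of $f(x, y)$ of $U$ nor on the point $(\xi, \eta)$.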


It remains to prove the equi-continuity. We apply Buniakowski's
inequality to the right-hand side of (170) and make use of (179):

$$|f_\rho(\xi + h, \eta + k) - f_\rho(\xi, \eta)|^2 \le \frac{m}{\rho^4} \int |\omega_\rho(x - h, y - k; \xi, \eta) - \omega_\rho(x, y; \xi, \eta)|^2\,dx\,dy,$$

and it follows from the uniform continuity of $\omega_\rho$ that the $f_\rho(\xi, \eta)$
are equi-continuous for any choice of $f(x, y)$ of $U$.

The above proofs are applicable for $L_p$ with $p \ge 1$. If $p > 1$, we
have to use the Hölder instead of the Buniakowski inequality, and
the formula

$$|\omega_\rho(x, y; \xi, \eta)| = |\omega_\rho(x, y; \xi, \eta)|^{1/p}\,|\omega_\rho(x, y; \xi, \eta)|^{1/p'} \quad \left(\frac{1}{p} + \frac{1}{p'} = 1\right).$$

When $p = 1$, the proof is as above, without the use of the Buniakowski
and Hölder inequalities. The following theorem therefore
holds.

THEOREM 2'. If $f(x, y) \in L_p(D_0)$ ($p \ge 1$), then

$$\int |f_\rho(x, y)|^p\,dx\,dy \le C_1 \int |f(x, y)|^p\,dx\,dy; \tag{180}$$

$$\lim_{\rho \to 0} \int |f(x, y) - f_\rho(x, y)|^p\,dx\,dy = 0. \tag{181}$$
= 


Theorem 3 also holds, with $L_2$ replaced by $L_p$. The proof also
remains in force when $D_0$ is an unbounded measurable set, e.g. the
entire plane $\mathcal{E}_\infty$. By using Theorem 3, the lineal of continuous finite
functions with continuous derivatives of all orders is easily shown to
be everywhere dense in $L_p(\mathcal{E}_\infty)$.

A further point: everything said above holds both for the real and
the complex space $L_p$. We have discussed mean functions on the
assumption that $f(x, y) \in L_p$ ($p \ge 1$). We now suppose that $f(x, y)$ is
continuous in the open domain $D_0$ and let $D'$ be any fixed closed
domain lying inside $D_0$. By virtue of the uniform continuity of $f(x, y)$
in any closed domain lying inside $D_0$, given an $\varepsilon > 0$, there exists
an $\eta_0 > 0$ such that $|f(x, y) - f(\xi, \eta)| \le \varepsilon$ if $(x, y) \in C_\rho(\xi, \eta)$ for
$\rho \le \eta_0$ and $(\xi, \eta)$ is any point of $D'$. Since

$$|f(\xi, \eta) - f_\rho(\xi, \eta)| \le \frac{1}{\rho^2} \int |f(\xi, \eta) - f(x, y)|\,|\omega_\rho(x, y; \xi, \eta)|\,dx\,dy \tag{182}$$

we have

$$|f(\xi, \eta) - f_\rho(\xi, \eta)| \le \varepsilon C \quad \text{for } \rho \le \eta_0 \text{ and } (\xi, \eta) \in D'.$$


If, in addition, $f(x, y)$ has continuous derivatives up to any order
in $D_0$, on replacing differentiation with respect to $\xi$ and $\eta$ by differentiation
with respect to $x$ and $y$ in (172), carrying out the integration
by parts and using the properties of $\omega_\rho$, we obtain for $(\xi, \eta) \in D'$
and sufficiently small $\rho$:

$$\frac{\partial^k f_\rho(\xi, \eta)}{\partial \xi^p\,\partial \eta^q} = \frac{1}{\rho^2} \int \frac{\partial^k f(x, y)}{\partial x^p\,\partial y^q}\,\omega_\rho(x, y; \xi, \eta)\,dx\,dy, \tag{183}$$

i.e. the derivative of the mean function is equal to the mean function of
the derivative for $(\xi, \eta) \in D'$ and sufficiently small $\rho$.




Notice that, given the conditions mentioned, $\omega_\rho(x, y; \xi, \eta)$ is
equal to zero close to the boundary of $D_0$ and on account of this, the
line integrals vanish when integrating by parts. We can choose for
the paths of integration in these integrals sufficiently smooth curves
lying sufficiently close to the boundary of $D_0$.

It follows from what has been said that: 

THEOREM 4. If $f(x, y)$ is continuous along with its derivatives up to
some order $l$ inside $D_0$, in any closed domain $D'$ lying inside $D_0$ the
mean function and its derivatives up to order $l$ tend uniformly to $f(x, y)$
and the corresponding derivatives as $\rho \to 0$.

It may be remarked further that, if $f(x, y)$ is bounded: $|f(x, y)| \le m$,
then

$$|f_\rho(\xi, \eta)| \le \frac{1}{\rho^2} \int |f(x, y)|\,|\omega_\rho(x, y; \xi, \eta)|\,dx\,dy \le Cm. \tag{184}$$


THEOREM 5. If $F(x, y)$ is summable over any closed domain $D'$ lying
inside $D_0$, and has the property that

$$\int_{D_0} F(x, y)\,\varphi(x, y)\,dx\,dy = 0 \tag{185}$$

for any choice of a continuous $\varphi(x, y)$ with continuous derivatives up
to some order $l$ inside $D_0$ and vanishing outside some closed domain
lying inside $D_0$, then $F$ is equivalent to zero.

It is sufficient to show that the hypotheses imply that (185) holds
for any bounded measurable finite $\varphi(x, y)$, i.e. equal to zero outside
some closed domain lying inside $D_0$ (as in the hypotheses, this domain
may be different for different $\varphi(x, y)$). After this, the proof that
$F$ is equivalent to zero is precisely the same as the proof of Theorem
12 of [52]. Let $\varphi(x, y)$ be such a function, where $|\varphi(x, y)| \le m$ and
$\varphi_\rho(x, y)$ is the mean function, so that $|\varphi_\rho(x, y)| \le Cm$. The function
$\varphi_\rho(x, y)$ is finite for sufficiently small $\rho$ and has continuous derivatives
of all orders, so that, by hypothesis,

$$\int_{D_0} F(x, y)\,\varphi_\rho(x, y)\,dx\,dy = 0. \tag{186}$$

The functions $\varphi_\rho(x, y)$ are convergent as $\rho \to 0$ in $L_2$ to $\varphi(x, y)$
on $D_0$, and a sequence $\varphi_{\rho_n}(x, y)$ exists, which tends to $\varphi(x, y)$ almost
everywhere. In addition, $|F(x, y)\,\varphi_{\rho_n}(x, y)| \le Cm\,|F(x, y)|$, where
$F(x, y)\,\varphi_{\rho_n}(x, y)$ are finite and we have a summable function on the
right. On passing to the limit in (186) with $\rho = \rho_n$, we get (185),
and the theorem is proved.




We must now mention one of the possibilities for choosing the
function $\omega_1(x, y; \xi, \eta)$; in fact, we put [cf. IV; 157]:

$$\omega_1(x, y; \xi, \eta) = \begin{cases} C_0\,e^{\frac{r^2}{r^2 - 1}} & \text{for } r < 1, \\ 0 & \text{for } r \ge 1, \end{cases} \qquad (r^2 = (x - \xi)^2 + (y - \eta)^2) \tag{187}$$

where the constant $C_0$ is chosen so that condition (164) is fulfilled.
The presence of continuous derivatives of all orders for $r < 1$ and
$r > 1$ is obvious. As $r \to 1$ from smaller values, $e^{r^2/(r^2 - 1)} \to 0$. It is
easily shown by induction that the derivative of any order has the
form, with $r < 1$:

$$\frac{\partial^{l+m} \omega_1}{\partial \xi^l\,\partial \eta^m} = \frac{p_{lm}(x - \xi,\, y - \eta)}{(r^2 - 1)^{2(l+m)}}\,e^{\frac{r^2}{r^2 - 1}},$$

where $p_{lm}(u, v)$ is a polynomial. As the point $(\xi, \eta)$ approaches the
circumference $r = 1$, this expression tends to zero. On using the
finite increments formula, we find that the derivative exists at any
point of this circumference and is equal to zero. In our example
(187), $\omega_1 > 0$ for $r < 1$, and the constant $C$ of (167) is equal to unity.
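
The constant in (187) is fixed only implicitly by condition (164). As a hedged illustration, the sketch below estimates the integral of $e^{r^2/(r^2-1)}$ over the unit circle in two independent ways (polar and Cartesian midpoint rules) and takes the reciprocal as the normalising constant. The function names and the quadrature sizes are choices made here and are not part of the text.

```python
import math

def profile(r2):
    """exp(r^2/(r^2-1)) as a function of r^2, continued by zero for r >= 1."""
    return math.exp(r2 / (r2 - 1.0)) if r2 < 1.0 else 0.0

def integral_polar(n=20000):
    """2*pi * integral over 0 <= r <= 1 of profile(r^2) * r dr, by the midpoint rule."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        total += profile(r * r) * r
    return 2.0 * math.pi * h * total

def integral_cartesian(n=400):
    """Midpoint rule over the square [-1, 1] x [-1, 1], which contains the unit circle."""
    h = 2.0 / n
    pts = [-1.0 + (i + 0.5) * h for i in range(n)]
    return h * h * sum(profile(x * x + y * y) for x in pts for y in pts)

polar = integral_polar()
cartesian = integral_cartesian()

# Reciprocal of the integral: the value the constant in (187) must take for (164) to hold.
normalising_constant = 1.0 / polar
```

The two estimates agree closely because the integrand vanishes together with all its derivatives on the circumference $r = 1$, which is exactly the property established in the paragraph above.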


CHAPTER III 


SET FUNCTIONS. ABSOLUTE 
CONTINUITY. GENERALIZATION 
OF THE INTEGRAL 


72. Additive set functions. Let $f(P)$ be a point function measurable
with respect to a non-negative, additive and normal function $G(\mathcal{E})$.
We form the indefinite integral

$$\varphi(\mathcal{E}) = \int_{\mathcal{E}} f(P)\,G(d\mathcal{E}). \tag{1}$$

It is defined for all the sets $\mathcal{E}$, belonging to the closed field $L_G$,
on which $f(P)$ is summable. Here, if $f(P)$ is summable on $\mathcal{E}$, it is
also summable on any measurable part $\mathcal{E}'$ of $\mathcal{E}$, and if $\mathcal{E}$ is split
into a finite or denumerable number of disjoint sets $\mathcal{E}_k$, then $\varphi(\mathcal{E})$
is equal to the sum of the $\varphi(\mathcal{E}_k)$ (it is completely additive). We shall next
consider the properties of completely additive functions given in any
manner, and not necessarily as an indefinite integral. Thus, let $\varphi(\mathcal{E})$ take
finite real values for sets belonging to some family $C$ of sets of some
closed field of sets $T$, containing all closed and open sets. We assume here
that, if $\mathcal{E}$ belongs to $C$, every part of $\mathcal{E}$ belonging to $T$ also belongs
to $C$. Moreover, we assume that $\varphi(\mathcal{E})$ is completely additive, i.e. if $\mathcal{E}$
belonging to $C$ is split into disjoint sets $\mathcal{E}_k$, the number of which is
finite or denumerable, where all the $\mathcal{E}_k$ belong to $T$ and hence to $C$:

$$\mathcal{E} = \sum_k \mathcal{E}_k, \tag{2}$$

then

$$\varphi(\mathcal{E}) = \sum_k \varphi(\mathcal{E}_k). \tag{3}$$

When the number of terms is infinite, the series written must be
absolutely convergent. In the case of (1), the closed field $T$ is the
field $L_G$, and the family $C$ consists of all the sets $\mathcal{E}$ of $L_G$ on which





$f(P)$ is summable. The most important case for what follows is that
when $C$ consists of some $\mathcal{E}_0$ belonging to $L_G$ and the sets of $L_G$
which belong to $\mathcal{E}_0$. In this case $C$ itself is obviously a closed field.

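Notice that, if $\varphi(\mathcal{E})$ is defined for sets of $T$ belonging say to some
closed interval $\Delta$, it can be defined for all sets $\mathcal{E}$ of $T$ by the formula

$$\varphi(\mathcal{E}) = \varphi(\mathcal{E}\Delta). \tag{4}$$

Here, it will take finite values and will be completely additive on
the whole of the field $T$. In future, when we speak of $\varphi(\mathcal{E})$, we shall
naturally assume that $\mathcal{E} \in C$. In view of the additivity, we must
have $\varphi(\mathcal{E}) = 0$ if $\mathcal{E}$ is the empty set. It follows at once from the
additivity that, if $\mathcal{E}'$ and $\mathcal{E}''$ belong to $C$ and $\mathcal{E}' \subset \mathcal{E}''$, then

$$\varphi(\mathcal{E}'' - \mathcal{E}') = \varphi(\mathcal{E}'') - \varphi(\mathcal{E}'). \tag{5}$$

Further, it follows from the complete additivity that, if $\mathcal{E}_n$
($n = 1, 2, \ldots$) is a monotonic sequence of sets of $C$ and if the limit
set $\mathcal{E}$ also belongs to $C$, then $\varphi(\mathcal{E}_n) \to \varphi(\mathcal{E})$. In the case of a
non-decreasing sequence of sets we have
$\mathcal{E} = \mathcal{E}_1 + (\mathcal{E}_2 - \mathcal{E}_1) + (\mathcal{E}_3 - \mathcal{E}_2) + \ldots$ and, in view of the complete additivity,
$\varphi(\mathcal{E}) = \varphi(\mathcal{E}_1) + [\varphi(\mathcal{E}_2) - \varphi(\mathcal{E}_1)] + [\varphi(\mathcal{E}_3) - \varphi(\mathcal{E}_2)] + \ldots$, i.e.
$\varphi(\mathcal{E}_n) \to \varphi(\mathcal{E})$. In the case of a non-increasing sequence of $\mathcal{E}_n$, the proof is
similar, $\mathcal{E}$ being necessarily of $C$. Notice further that a finite linear
combination of completely additive functions:
$c_1\varphi_1(\mathcal{E}) + c_2\varphi_2(\mathcal{E}) + \ldots + c_p\varphi_p(\mathcal{E})$, is obviously also completely additive. We turn to
the proof of the theorems fundamental to the theory.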

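THEOREM 1. The absolute value of $\varphi(\mathcal{E})$ is always bounded by the
same number, whatever the set $\mathcal{E}$ belonging to any set $\mathcal{E}_0$ of $C$.

We use reductio ad absurdum. If this is not the case, there exists
an $\mathcal{E}_1 \subset \mathcal{E}_0$ such that $|\varphi(\mathcal{E}_1)| > 2$ and $|\varphi(\mathcal{E}_0 - \mathcal{E}_1)| > 2$. We have
to take into account here that

$$\varphi(\mathcal{E}_0) = \varphi(\mathcal{E}_1) + \varphi(\mathcal{E}_0 - \mathcal{E}_1),$$

and that the fact of one term being unbounded implies that the other
is unbounded, since $\varphi(\mathcal{E}_0)$ is a given number. The theorem is not
fulfilled for $\mathcal{E}_1$ or $\mathcal{E}_0 - \mathcal{E}_1$. We can assume that it is not satisfied
for $\mathcal{E}_1$ and there exists an $\mathcal{E}_2 \subset \mathcal{E}_1$ such that $|\varphi(\mathcal{E}_2)| > 3$ and
$|\varphi(\mathcal{E}_1 - \mathcal{E}_2)| > 3$ and so on. We have: $\mathcal{E}_1 \supset \mathcal{E}_2 \supset \mathcal{E}_3 \ldots$ and, on
writing $\mathcal{E} = \mathcal{E}_1\mathcal{E}_2\ldots$, by what has been said, $\varphi(\mathcal{E}_n) \to \varphi(\mathcal{E})$, which
is absurd, since $\varphi(\mathcal{E}_n)$ is indefinitely increasing in absolute value. Thus
the theorem is proved.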




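Let $\delta$ be some subdivision of $\mathcal{E}$ into a finite number of $\mathcal{E}_k$. We form
the sum

$$t_\delta = \sum_k |\varphi(\mathcal{E}_k)| \tag{6}$$

and show that the set of values of $t_\delta$ is bounded for any $\delta$. Let $\mathcal{E}'_\delta$
be the sum of the $\mathcal{E}_k$ for which $\varphi(\mathcal{E}_k) \ge 0$, and $\mathcal{E}''_\delta$ the sum of the
$\mathcal{E}_k$ for which $\varphi(\mathcal{E}_k) < 0$. Since $\varphi(\mathcal{E})$ is additive, we can write

$$t_\delta = \varphi(\mathcal{E}'_\delta) - \varphi(\mathcal{E}''_\delta). \tag{7}$$

On also observing that $\mathcal{E}'_\delta + \mathcal{E}''_\delta = \mathcal{E}$, so that
$\varphi(\mathcal{E}) = \varphi(\mathcal{E}'_\delta) + \varphi(\mathcal{E}''_\delta)$, we can rewrite (7) as

$$t_\delta = 2\varphi(\mathcal{E}'_\delta) - \varphi(\mathcal{E}) = \varphi(\mathcal{E}) - 2\varphi(\mathcal{E}''_\delta). \tag{8}$$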


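We write $\overline{\varphi}(\mathcal{E})$ and $\underline{\varphi}(\mathcal{E})$ for the strict upper and lower bounds of
$\varphi(e)$ if $e \subset \mathcal{E}$, the empty set being also assumed to belong to $\mathcal{E}$:

$$\overline{\varphi}(\mathcal{E}) = \sup \varphi(e); \quad \underline{\varphi}(\mathcal{E}) = \inf \varphi(e) \quad (e \subset \mathcal{E}). \tag{9}$$

By Theorem 1, we can say that $\overline{\varphi}(\mathcal{E})$ and $\underline{\varphi}(\mathcal{E})$ are finite. It follows
from the first of formulae (8) that $t_\delta$ is bounded for any choice of
$\delta$: $t_\delta \le 2\overline{\varphi}(\mathcal{E}) - \varphi(\mathcal{E})$. The strict upper bound of the sums $t_\delta$ for
all possible subdivisions $\delta$ is called the total variation of $\varphi(\mathcal{E})$ on
the set $\mathcal{E}$. We write it as $\tilde{\varphi}(\mathcal{E})$. If $\delta_n$ is a sequence of subdivisions
such that $t_{\delta_n}$ tends to $\tilde{\varphi}(\mathcal{E})$, it follows from the first of formulae (8)
that $\varphi(\mathcal{E}'_{\delta_n})$ now tends to $\overline{\varphi}(\mathcal{E})$, whilst the second of formulae (8),
which can be rewritten as

$$2\varphi(\mathcal{E}''_{\delta_n}) = \varphi(\mathcal{E}) - t_{\delta_n},$$

shows that $\varphi(\mathcal{E}''_{\delta_n}) \to \underline{\varphi}(\mathcal{E})$, so that (8) gives in the limit, with $\delta$ replaced
by $\delta_n$:

$$\tilde{\varphi}(\mathcal{E}) = 2\overline{\varphi}(\mathcal{E}) - \varphi(\mathcal{E}) = \varphi(\mathcal{E}) - 2\underline{\varphi}(\mathcal{E}),$$

whence

$$\overline{\varphi}(\mathcal{E}) = \tfrac{1}{2}[\tilde{\varphi}(\mathcal{E}) + \varphi(\mathcal{E})]; \quad \underline{\varphi}(\mathcal{E}) = -\tfrac{1}{2}[\tilde{\varphi}(\mathcal{E}) - \varphi(\mathcal{E})],$$

$$\tilde{\varphi}(\mathcal{E}) = \overline{\varphi}(\mathcal{E}) - \underline{\varphi}(\mathcal{E}), \tag{10}$$

$$\varphi(\mathcal{E}) = \overline{\varphi}(\mathcal{E}) + \underline{\varphi}(\mathcal{E}) = \overline{\varphi}(\mathcal{E}) - [-\underline{\varphi}(\mathcal{E})]. \tag{11}$$

It follows from the definition of $\overline{\varphi}(\mathcal{E})$ and $\underline{\varphi}(\mathcal{E})$ that $\overline{\varphi}(\mathcal{E}) \ge 0$ and
$\underline{\varphi}(\mathcal{E}) \le 0$. The functions $\overline{\varphi}(\mathcal{E})$ and $-\underline{\varphi}(\mathcal{E})$ are usually called the
positive and negative variations of $\varphi(\mathcal{E})$ on $\mathcal{E}$.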
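
Relations (8) to (11) are easy to check by brute force on a finite set. The sketch below assumes a set function concentrated on five atoms with hypothetical signed weights; it computes the upper and lower bounds of (9) over all subsets and the total variation as the largest sum (6) over all subdivisions. The weights and helper names are invented for the illustration.

```python
from itertools import combinations

# Hypothetical signed weights of five atoms; phi(E) is the sum of the weights of the atoms in E.
weights = {"p1": 2.0, "p2": -1.5, "p3": 0.5, "p4": -3.0, "p5": 1.0}

def phi(subset):
    return sum(weights[p] for p in subset)

def subsets(points):
    """All subsets of `points`, the empty set included."""
    points = list(points)
    for size in range(len(points) + 1):
        yield from combinations(points, size)

def partitions(items):
    """All subdivisions of the list `items` into non-empty disjoint blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part

E = list(weights)
upper = max(phi(e) for e in subsets(E))   # strict upper bound of phi(e), as in (9)
lower = min(phi(e) for e in subsets(E))   # strict lower bound of phi(e), as in (9)

# Total variation: the largest value of the sum (6) over all subdivisions of E.
total = max(sum(abs(phi(block)) for block in part) for part in partitions(E))
```

Here the upper bound is 3.5, the lower bound is -4.5 and the total variation is 8, so that the total variation equals the difference of the two bounds as in (10), and their sum reproduces $\varphi(\mathcal{E}) = -1$ as in (11).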




THEOREM 2. The positive, negative and total variations are completely 
additive functions on C. 

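Given any subdivision of $\mathcal{E}$ into $\mathcal{E}_k$, we form the series with
non-negative terms:

$$S = \sum_k \overline{\varphi}(\mathcal{E}_k),$$

and show that $S = \overline{\varphi}(\mathcal{E})$. First of all, we show that $S < +\infty$. For,
if we had $S = +\infty$, given a suitable choice of $e_k \subset \mathcal{E}_k$, the sum

$$\sum_k \varphi(e_k)$$

would take as large a value as desired.
But this sum is equal to $\varphi(e)$, where

$$e = \sum_k e_k \quad (e \subset \mathcal{E}),$$

and we have arrived at a contradiction with the assertion of Theorem 1.
Given $\varepsilon > 0$, we have $\varphi(e) \ge S - \varepsilon$ for a suitable choice of $e_k$, so
that all the more $\overline{\varphi}(\mathcal{E}) \ge S - \varepsilon$, whence, since $\varepsilon$ is arbitrary, $\overline{\varphi}(\mathcal{E}) \ge S$.
Let us prove the reverse inequality. We choose $e \subset \mathcal{E}$ so that
$\varphi(e) \ge \overline{\varphi}(\mathcal{E}) - \varepsilon$; let $e_k = e\mathcal{E}_k$. We have

$$\varphi(e) = \sum_k \varphi(e_k), \quad \text{i.e.} \quad \overline{\varphi}(\mathcal{E}) - \varepsilon \le \sum_k \varphi(e_k),$$

whence all the more:

$$\overline{\varphi}(\mathcal{E}) - \varepsilon \le \sum_k \overline{\varphi}(\mathcal{E}_k),$$

and, since $\varepsilon$ is arbitrary, $\overline{\varphi}(\mathcal{E}) \le \sum_k \overline{\varphi}(\mathcal{E}_k)$, and finally

$$\overline{\varphi}(\mathcal{E}) = \sum_k \overline{\varphi}(\mathcal{E}_k).$$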
k 


Similarly, the negative variation is completely additive, so that the 
same is true of the total variation, by (10). Equation (11) shows that 
every completely additive function is the difference between com- 
pletely additive non-negative functions. Notice further that, if we 
had used a subdivision of & into an infinite number of sets to form 
the sums (6), the previous strict upper bound would have been 
obtained. 

Any point (closed set) belongs to $T$. If $\mathcal{E} \in C$, any point $P$ of $\mathcal{E}$
belongs to $C$, and we can speak of the value $\varphi(P)$ of the function
$\varphi(\mathcal{E})$ at the point $P$. If $\varphi(P) \ne 0$, $P$ is called a point of discontinuity
of $\varphi(\mathcal{E})$. Otherwise, it is a point of continuity of $\varphi(\mathcal{E})$. If $\varphi(P) > 0$,
it follows from the definitions given above that $\overline{\varphi}(P) = \varphi(P)$ and
$\underline{\varphi}(P) = 0$, whilst if $\varphi(P) < 0$, then $\overline{\varphi}(P) = 0$ and $\underline{\varphi}(P) = \varphi(P)$.
If $\varphi(\mathcal{E})$ is continuous at $P$, $\overline{\varphi}(\mathcal{E})$ and $\underline{\varphi}(\mathcal{E})$ are also continuous at $P$.
Since $\overline{\varphi}(\mathcal{E})$ and $\underline{\varphi}(\mathcal{E})$ are finite, there is a finite number of points of
discontinuity belonging to $\mathcal{E}$ and such that $\varphi(P) \ge a$ or $\varphi(P) \le -a$,
where $a$ is a given positive number; also, the number of all the points
of discontinuity is finite or denumerable. Let these points be $P_k$.
If the set of $P_k$ is denumerable, the series formed from the $\varphi(P_k)$ is
absolutely convergent. We introduce a new set function, defined on
the family $C$:

$$\varphi_d(\mathcal{E}) = \sum_{P_k \in \mathcal{E}} \varphi(P_k), \tag{12}$$

where the summation is over the points $P_k$ of $\mathcal{E}$. This function is
also completely additive. It is called the jump function. The difference

$$\varphi_c(\mathcal{E}) = \varphi(\mathcal{E}) - \varphi_d(\mathcal{E}) \tag{13}$$

is a completely additive function with no points of discontinuity.

73. Singular function. In future, the field $T$ will be the field $L_G$.
As a matter of fact, not every function $\varphi(\mathcal{E})$, completely additive on
a family $C$ of $L_G$, can be written as an integral (1).

We shall prove later the following fundamental theorem, which
we shall shortly make use of.

THEOREM. Every function $\varphi(\mathcal{E})$, completely additive on $C$, can be
expressed for all sets $\mathcal{E}$ belonging to any fixed set $\mathcal{E}_0$ of $C$ by the formula

$$\varphi(\mathcal{E}) = \varphi(\mathcal{E}H) + \int_{\mathcal{E}} f(P)\,G(d\mathcal{E}), \tag{14}$$

where $H$ is a definite set of $\mathcal{E}_0$, such that $G(H) = 0$, and $f(P)$ is measurable
and summable on $\mathcal{E}_0$.

The term $\varphi(\mathcal{E}H)$ is called the singular part of
$\varphi(\mathcal{E})$. The singular part is defined by the values of $\varphi(\mathcal{E})$ on sets of
measure zero. The second term, which we call the absolutely continuous
part, vanishes on any set of measure zero. We now show that
the expression as the sum of a singular and absolutely continuous
part is unique. Suppose we have, for $\mathcal{E}$ of $C$ that belong to $\mathcal{E}_0$, in addition
to (14):

$$\varphi(\mathcal{E}) = \varphi(\mathcal{E}H_1) + \int_{\mathcal{E}} f_1(P)\,G(d\mathcal{E}),$$

where $G(H_1) = 0$. We have from this formula and (14):

$$\varphi(\mathcal{E}H) - \varphi(\mathcal{E}H_1) = \int_{\mathcal{E}} f_1(P)\,G(d\mathcal{E}) - \int_{\mathcal{E}} f(P)\,G(d\mathcal{E}).$$



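We replace $\mathcal{E}$ by the set $\mathcal{E}H + \mathcal{E}H_1$ belonging to $\mathcal{E}_0$. On observing
that $G(\mathcal{E}H + \mathcal{E}H_1) = 0$, so that the integral over $\mathcal{E}H + \mathcal{E}H_1$ is
zero, and that $(\mathcal{E}H + \mathcal{E}H_1)H = \mathcal{E}H$ and $(\mathcal{E}H + \mathcal{E}H_1)H_1 = \mathcal{E}H_1$,
we get $\varphi(\mathcal{E}H) = \varphi(\mathcal{E}H_1)$, whence it follows that the absolutely
continuous parts must be the same, i.e.

$$\int_{\mathcal{E}} f(P)\,G(d\mathcal{E}) = \int_{\mathcal{E}} f_1(P)\,G(d\mathcal{E}). \tag{15}$$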
d é 


To prove the theorem, we start from an arbitrary but fixed set
$\mathcal{E}_0$ belonging to $C$ and assume that all $\mathcal{E} \subset \mathcal{E}_0$, as is stated in the
theorem. When decomposing $\varphi(\mathcal{E})$ into a singular and an absolutely
continuous part, we started from a set $\mathcal{E}_0$ and assumed that
the whole of $\mathcal{E}$ belonged to $\mathcal{E}_0$. We thereby obtained a unique decomposition.
If we had started from some other set $\mathcal{E}'_0$, different from
$\mathcal{E}_0$ and belonging to $C$, the earlier original decomposition would
evidently have been obtained for all sets belonging simultaneously
to $\mathcal{E}_0$ and $\mathcal{E}'_0$. For, we should otherwise obtain two different
decompositions of $\varphi(\mathcal{E})$ for sets belonging to the product $\mathcal{E}''_0 = \mathcal{E}_0\mathcal{E}'_0$,
which also appears in the family $C$, and this is impossible, as we
have seen above.

We can say, in the sense indicated, that the decomposition of $\varphi(\mathcal{E})$
into a singular and an absolutely continuous part is unique in the
whole of the family $C$.

Let us show that the $f(P)$ appearing in the integrand in (14) is
well defined, on the assumption that functions equivalent with
respect to $G(\mathcal{E})$ are identified in the usual way. We have to show
that, if (15) holds for all $\mathcal{E}$ that belong to $\mathcal{E}_0$, $\psi(P) = f_1(P) - f(P)$
is equivalent to zero on $\mathcal{E}_0$.

Let $\mathcal{E}_0^+$ be the part of $\mathcal{E}_0$ where $\psi(P) \ge 0$, and $\mathcal{E}_0^- = \mathcal{E}_0 - \mathcal{E}_0^+$.
The sets $\mathcal{E}_0^+$ and $\mathcal{E}_0^-$ belong to $C$, and we have, on replacing $\mathcal{E}$ in (15)
by $\mathcal{E}_0^+$ and $\mathcal{E}_0^-$:

$$\int_{\mathcal{E}_0^+} \psi(P)\,G(d\mathcal{E}) = \int_{\mathcal{E}_0^-} \psi(P)\,G(d\mathcal{E}) = 0,$$

whence it follows that $\psi(P)$ is equivalent to zero on $\mathcal{E}_0^+$ and $\mathcal{E}_0^-$,
and hence on $\mathcal{E}_0$. If we form the function $f(P)$ for two sets $\mathcal{E}_0$ and $\mathcal{E}'_0$
of $C$, these two functions will be equivalent on $\mathcal{E}''_0 = \mathcal{E}_0\mathcal{E}'_0$ as above.
In this sense, we can speak of the uniqueness of the function $f(P)$.
If, for instance, all finite intervals belong to $C$, on applying the
foregoing arguments to the widening intervals $-n \le x \le +n$;
$-n \le y \le +n$ ($n = 1, 2, \ldots$), we define $f(P)$ uniquely on the whole of
the plane. The function $f(P)$ is usually called the derivative of $\varphi(\mathcal{E})$




with respect to $G(\mathcal{E})$. Let $k_\varepsilon^P$ be a circle (or sphere) with centre $P$
and radius $\varepsilon$. It can be shown that, for all $P$, excepting possibly a
set of measure zero with respect to $G(\mathcal{E})$, the ratio $\varphi(k_\varepsilon^P)/G(k_\varepsilon^P)$
tends to a function equivalent to $f(P)$ with respect to $G(\mathcal{E})$ as $\varepsilon$ tends
to zero. Obviously, it is assumed here that $\varphi(\mathcal{E})$ is defined on the
circle $k_\varepsilon^P$ for sufficiently small $\varepsilon$. We shall make no future use of
this assertion and omit the proof.

DEFINITION. A function $\varphi(\mathcal{E})$ is said to be absolutely continuous
with respect to $G(\mathcal{E})$ if, given any fixed $\mathcal{E}_0$ of $C$ and any $\varepsilon > 0$, there
exists a positive $\eta$ such that $|\varphi(e)| \le \varepsilon$ if $e \subset \mathcal{E}_0$ and $G(e) \le \eta$.

If $\varphi(\mathcal{E})$ is absolutely continuous with respect to $G(\mathcal{E})$, obviously
$\varphi(\mathcal{E}) = 0$ if $\mathcal{E} \in C$ and $G(\mathcal{E}) = 0$. The second term of (14) is an
absolutely continuous function on $C$, as we know. Conversely, if we know
that $\varphi(\mathcal{E})$ is absolutely continuous, then $\varphi(\mathcal{E}H) = 0$, since $G(\mathcal{E}H) = 0$
and $\varphi(\mathcal{E})$ is expressible by

$$\varphi(\mathcal{E}) = \int_{\mathcal{E}} f(P)\,G(d\mathcal{E}), \tag{16}$$

i.e. the singular part is absent. This discussion leads to the following
corollary of the fundamental theorem.

COROLLARY. If $\varphi(\mathcal{E}) = 0$ for $G(\mathcal{E}) = 0$, then $\varphi(\mathcal{E})$ is expressible by
(16) and is absolutely continuous with respect to $G(\mathcal{E})$ on any set $\mathcal{E}_0$
of $C$.

Notice that, if $G(\mathcal{E})$ is not continuous, the $\varphi(\mathcal{E})$ given by (16) is
also in general not continuous. If, for instance, $G(P_0) = a \ne 0$, then
$\varphi(P_0) = a f(P_0)$. But $\varphi(\mathcal{E})$ is absolutely continuous with respect to
$G(\mathcal{E})$ in the sense indicated above.
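
Formula (14) can be made concrete on a finite set of atoms, where every sum is finite. The sketch below assumes hypothetical point masses for $G$ and hypothetical values of $\varphi$ at the same points; the set $H$ collects the atoms of $G$-measure zero on which $\varphi$ does not vanish, and $f$ is taken as the ratio $\varphi(P)/G(P)$ elsewhere. None of these numbers or names comes from the text.

```python
from itertools import combinations

# Hypothetical atoms: G is the measure of each point, phi_point the value of phi at that point.
G = {"a": 1.0, "b": 2.0, "c": 0.0, "d": 0.5, "e": 0.0}
phi_point = {"a": 3.0, "b": -1.0, "c": 4.0, "d": 0.0, "e": -2.0}

def phi(E):
    """Completely additive set function: the sum of the point values over E."""
    return sum(phi_point[p] for p in E)

# H carries the singular part: G(H) = 0, yet phi does not vanish on its points.
H = {p for p in G if G[p] == 0.0 and phi_point[p] != 0.0}

# Derivative of phi with respect to G off H; its values on the G-null set H are immaterial.
f = {p: (phi_point[p] / G[p] if G[p] > 0.0 else 0.0) for p in G}

def decomposition(E):
    """Return (singular part phi(EH), absolutely continuous part) of phi(E), as in (14)."""
    singular = phi([p for p in E if p in H])
    absolutely_continuous = sum(f[p] * G[p] for p in E)
    return singular, absolutely_continuous

points = list(G)
all_subsets = [c for r in range(len(points) + 1) for c in combinations(points, r)]
holds_everywhere = all(abs(phi(E) - sum(decomposition(E))) < 1e-12 for E in all_subsets)
```

In this toy case the absolutely continuous part vanishes on every set of $G$-measure zero, while the singular part sees only the points "c" and "e", and the two parts add up to $\varphi(\mathcal{E})$ on every subset.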

If $G(\mathcal{E})$ is continuous, the $\varphi(\mathcal{E})$ given by (16) is obviously
continuous. If $G(\Delta)$ is the area of the interval $\Delta$, so that $L_G$ is the
field $L$ of Lebesgue measurable sets, (14) becomes

$$\varphi(\mathcal{E}) = \varphi(\mathcal{E}H) + \iint_{\mathcal{E}} f(x, y)\,dx\,dy,$$

where $H$ is of Lebesgue measure zero. Equation (16) becomes

$$\varphi(\mathcal{E}) = \iint_{\mathcal{E}} f(x, y)\,dx\,dy,$$

and in this case $\varphi(\mathcal{E})$ is obviously continuous at every point.

The strict upper bound of the values of $\varphi(e)$ for sets $e$ belonging
to $\mathcal{E}$ is obviously obtained in the case of (16) if we integrate $f(P)$
over the set on which $f(P) \ge 0$, and the strict lower bound is obtained




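if we integrate over the set on which $f(P) < 0$. We thus have the
following formula for the positive, negative, and total variations of
the $\varphi(\mathcal{E})$ given by (16), with $f^+ = \max(f, 0)$ and $f^- = \max(-f, 0)$:

$$\overline{\varphi}(\mathcal{E}) = \int_{\mathcal{E}} f^+(P)\,G(d\mathcal{E}); \quad -\underline{\varphi}(\mathcal{E}) = \int_{\mathcal{E}} f^-(P)\,G(d\mathcal{E}); \quad \tilde{\varphi}(\mathcal{E}) = \int_{\mathcal{E}} |f(P)|\,G(d\mathcal{E}). \tag{17}$$

If we extract the jump function $\varphi_d(\mathcal{E})$ from $\varphi(\mathcal{E})$ and apply decomposition
formula (14) to the remaining continuous function, we get a
decomposition of $\varphi(\mathcal{E})$ into three terms:

$$\varphi(\mathcal{E}) = \varphi_d(\mathcal{E}) + \varphi_c(\mathcal{E}H) + \int_{\mathcal{E}} f_c(P)\,G(d\mathcal{E}). \tag{18}$$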


74. The case of one variable. Let $\mathcal{E}_0$ be e.g. the finite interval
$[a, b]$. We naturally introduce, instead of $G(\mathcal{E})$, the corresponding
non-decreasing point function $g(x)$. We introduce, in place of the
set function $\varphi(\mathcal{E})$, the point function $\omega(x) = \varphi([a, x])$, and formula (14)
for it takes the form

$$\omega(x) = \varphi([a, x]H) + \int_{[a, x]} f(x)\,dg(x), \tag{19}$$

and for the Lebesgue integrals:

$$\omega(x) = \varphi([a, x]H) + \int_a^x f(x)\,dx. \tag{20}$$

When $\varphi(\mathcal{E})$ is absolutely continuous we have the formula

$$\omega(x) = \int_{[a, x]} f(x)\,dg(x) \quad \text{and} \quad \omega(x) = \int_a^x f(x)\,dx. \tag{21}$$

Let us consider in greater detail the case of the Lebesgue integral.
When passing from an interval function to a point function, constants
can be added to the latter, and we can write

$$\omega(x) = \int_a^x f(x)\,dx + \omega(a), \tag{22}$$

where $f(x)$ is a measurable function, summable over $[a, b]$. Since the
integral of $f(x)$ is absolutely continuous, function (22) has the following
property: given any $\varepsilon > 0$, there is a corresponding $\eta > 0$ such



that, if $(a_k, b_k)$ ($k = 1, 2, \ldots, n$) are non-overlapping intervals for
which

$$\sum_{k=1}^{n} (b_k - a_k) \le \eta, \tag{23}$$

then

$$\left|\sum_{k=1}^{n} [\omega(b_k) - \omega(a_k)]\right| \le \varepsilon. \tag{24}$$


We shall start out from a point function, and say that the
point function $\omega(x)$, defined on the interval $[a, b]$, is absolutely
continuous on this interval if it has the property just indicated.
Obviously, an absolutely continuous function is simply continuous,
since we can take in particular $n = 1$. As we shall see later, there
exist monotonic continuous point functions that are not absolutely
continuous. The following property is a consequence of the one
described: given any $\varepsilon > 0$, there is a corresponding $\eta > 0$ such that,
if (23) is fulfilled, then

$$\sum_{k=1}^{n} |\omega(b_k) - \omega(a_k)| \le \varepsilon. \tag{25}$$

In fact, if $\omega(x)$ has the above property (24), i.e. is absolutely continuous,
there is an $\eta$ corresponding to the given $\varepsilon$ such that, when
(23) is satisfied,

$$\left|\sum_{k=1}^{n} [\omega(b_k) - \omega(a_k)]\right| \le \frac{\varepsilon}{2}. \tag{26}$$

We can split any system of intervals $(a_k, b_k)$ satisfying (23) into
two classes, where class I contains the intervals for which
$\omega(b_k) - \omega(a_k) \ge 0$, and class II those for which $\omega(b_k) - \omega(a_k) < 0$. We have
by (26):

$$\sum_{\mathrm{I}} |\omega(b_k) - \omega(a_k)| = \sum_{\mathrm{I}} [\omega(b_k) - \omega(a_k)] \le \frac{\varepsilon}{2},$$

$$\sum_{\mathrm{II}} |\omega(b_k) - \omega(a_k)| = \left|\sum_{\mathrm{II}} [\omega(b_k) - \omega(a_k)]\right| \le \frac{\varepsilon}{2},$$

whence (25) follows. Since the terms of sum (25) are non-negative
and $n$ is arbitrary, we obtain the following property of absolutely
continuous functions: given any $\varepsilon > 0$, there is a corresponding
$\eta > 0$ such that, if $(a_k, b_k)$ is a finite or denumerable set of mutually

non-overlapping intervals satisfying the condition

$$\sum_k (b_k - a_k) \le \eta, \tag{27}$$

then

$$\sum_k |\omega(b_k) - \omega(a_k)| \le \varepsilon. \tag{28}$$

Conversely, if this condition is satisfied, the original condition (24)
is all the more satisfied, and $\omega(x)$ is absolutely continuous.
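
The remark that a monotonic continuous function need not be absolutely continuous can be illustrated with the Cantor function, a classical example that is supplied here as an assumption and is not named in the text. The sketch below uses exact rational arithmetic: at the $n$-th stage of the Cantor construction the $2^n$ remaining closed intervals have total length $(2/3)^n$, yet the increments of the function over them still add up to 1, so condition (25) fails for every $\varepsilon < 1$ however small $\eta$ is taken in (23).

```python
from fractions import Fraction

def cantor(x, depth=60):
    """Cantor function on [0, 1], read off from the ternary digits of x (exact for Fractions)."""
    x = Fraction(x)
    if x <= 0:
        return Fraction(0)
    if x >= 1:
        return Fraction(1)
    value, scale = Fraction(0), Fraction(1, 2)
    for _ in range(depth):
        x *= 3
        digit = int(x)          # next ternary digit: 0, 1 or 2
        x -= digit
        if digit == 1:          # x lies in a removed middle third: the function is constant there
            return value + scale
        if digit == 2:
            value += scale
        scale /= 2
    return value

def cantor_cover(n):
    """The 2^n closed intervals left after n steps of removing middle thirds."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        refined = []
        for a, b in intervals:
            third = (b - a) / 3
            refined.append((a, a + third))
            refined.append((b - third, b))
        intervals = refined
    return intervals

def total_length(intervals):
    return sum(b - a for a, b in intervals)

def total_increment(intervals):
    return sum(abs(cantor(b) - cantor(a)) for a, b in intervals)

stage = cantor_cover(8)
length_at_stage_8 = total_length(stage)        # equals (2/3)**8, already below 0.04
increment_at_stage_8 = total_increment(stage)  # remains equal to 1
```

Exact fractions are used so that the interval end points, which are ternary rationals, are classified without rounding error.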

THEOREM 1. The sum, difference and product of two absolutely
continuous functions are absolutely continuous functions. The quotient
$\omega_1(x)/\omega_2(x)$ of two absolutely continuous functions is also absolutely
continuous, provided $\omega_2(x)$ does not vanish.

We shall only give a proof for the product $\omega_1(x)\,\omega_2(x)$. The individual
functions are bounded in $[a, b]$, i.e. $|\omega_1(x)| \le l_1$ and $|\omega_2(x)| \le l_2$.
We have

$$|\omega_1(b_k)\,\omega_2(b_k) - \omega_1(a_k)\,\omega_2(a_k)| \le |\omega_2(b_k)|\,|\omega_1(b_k) - \omega_1(a_k)| + |\omega_1(a_k)|\,|\omega_2(b_k) - \omega_2(a_k)| \le l_2\,|\omega_1(b_k) - \omega_1(a_k)| + l_1\,|\omega_2(b_k) - \omega_2(a_k)|.$$

On summing over $k$ and taking into account the absolute continuity
of $\omega_1(x)$ and $\omega_2(x)$, property (25) follows for the product.

THEOREM 2. An absolutely continuous function $\omega(x)$ is of bounded
variation, and its total variation $v(x)$ is also an absolutely continuous
function.

Let $\eta_0$ be a positive number such that, when (23) is satisfied for
$\eta = \eta_0$ we have

$$\sum_k |\omega(b_k) - \omega(a_k)| \le 1. \tag{29}$$

We subdivide $[a, b]$ by fixed points $a = c_0 < c_1 < \ldots < c_{N-1} < c_N = b$
such that $c_k - c_{k-1} \le \eta_0$ ($k = 1, 2, \ldots, N$). Given any
subdivision of the interval $[c_{k-1}, c_k]$, we have (29) for it, and the sums
$t_\delta$ and hence the total variation of $\omega(x)$ on each of the $[c_{k-1}, c_k]$
is not greater than unity, and not greater than $N$ on the whole of
$[a, b]$. Let $\eta$ correspond to $\varepsilon$, so that, when (23) is satisfied, (25) is
fulfilled. Let us subdivide each of the $[a_k, b_k]$ appearing in condition
(23). The sum of the lengths of the sub-intervals obtained will satisfy
condition (23) as before, and sum (25) corresponding to the
sub-intervals will be $\le \varepsilon$ as before. The strict upper bound of the sums of




terms corresponding to the sub-intervals of $[a_k, b_k]$ obviously gives
$v(b_k) - v(a_k)$, and thus we have, when (23) is fulfilled:

$$\sum_{k=1}^{n} [v(b_k) - v(a_k)] \le \varepsilon,$$

whence it follows that $v(x)$ is absolutely continuous, and the theorem
is proved.
By forming the functions

$$\omega_1(x) = \tfrac{1}{2}[v(x) + \omega(x)]; \quad \omega_2(x) = \tfrac{1}{2}[v(x) - \omega(x)], \tag{30}$$

which are non-decreasing [8] and are absolutely continuous by Theorem
1, we can write $\omega(x)$ as the difference between two non-decreasing
absolutely continuous functions:

$$\omega(x) = \omega_1(x) - \omega_2(x). \tag{31}$$

As we have mentioned, the indefinite integral (22) of the summable
function $f(x)$ yields an absolutely continuous point function $\omega(x)$
in the sense of the above definition (23), (24) or (23), (25). Let us
now prove the converse.

THEOREM 3. Every absolutely continuous function $\omega(x)$ is expressible
as the indefinite integral

$$\omega(x) = \int_a^x f(x)\,dx + \omega(a). \tag{32}$$

By using the function $\omega_1(x)$ and putting $\omega_1(x) = \omega_1(a)$ for $x < a$,
and $\omega_1(x) = \omega_1(b)$ for $x > b$, we can associate the interval $\Delta\,[\alpha, \beta]$
with a non-negative number $\varphi_1(\Delta) = \omega_1(\beta) - \omega_1(\alpha)$, where it is
unimportant whether $\Delta$ be open or closed, in view of the continuity
of $\omega_1(x)$. If the linear set $\mathcal{E}$ is Lebesgue measurable, there exists
an open set $O$ containing $\mathcal{E}$ such that the set $(O - \mathcal{E})$ can be covered
by a finite or denumerable number of intervals $[a_k, b_k]$, the sum of
the lengths of which is as small as desired [35]. Since $\omega_1(x)$ is absolutely
continuous in $[a, b]$ and is prolonged by a constant outside $[a, b]$,
we can carry out this covering in such a way that the sum of the
non-negative terms $\omega_1(b_k) - \omega_1(a_k)$ for the covering intervals $[a_k, b_k]$
is as small as desired, i.e. if $\mathcal{E}$ is Lebesgue measurable, it is also
measurable with respect to $\omega_1(x)$. We can thus carry over $\varphi_1(\Delta)$ to
all sets of $L$ that belong to $[a, b]$, with the property of being completely
additive. It also follows from the above discussion that, if the Lebesgue


220 SET FUNCTIONS, ABSOLUTE CONTINUITY [74 


measure m(Z) = 0, then 9,(Z) = 0, so that we have 
pı (F) = Sh (x) dz. (33) 
ë 


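Similarly, by forming $\varphi_2(\Delta) = \omega_2(\beta) - \omega_2(\alpha)$, we get

$$\varphi_2(\mathcal{E}) = \int_{\mathcal{E}} f_2(x)\,dx, \tag{34}$$

where $f_2(x)$ is summable on $[a, b]$ and

$$\varphi(\mathcal{E}) = \varphi_1(\mathcal{E}) - \varphi_2(\mathcal{E}) = \int_{\mathcal{E}} [f_1(x) - f_2(x)]\,dx = \int_{\mathcal{E}} f(x)\,dx. \tag{35}$$

If we take the interval $[a, x]$ as $\mathcal{E}$, we arrive at (22).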

The $f(x)$ appearing in (22) can be shown to be uniquely defined
apart from an added term which is zero almost everywhere. For, if
we had a second formula of type (22) for $\omega(x)$ with the integrand
$g(x)$, the integral of $f(x) - g(x)$ over any interval belonging to $[a, b]$
would be zero, and we could assert, by property 11 of [52], that the
difference in question is equivalent to zero. The $f(x)$ in (22) is called
the derivative of $\omega(x)$ and is usually denoted by $f(x) = \omega'(x)$. It can
be shown that, for all $x$ of $[a, b]$ excepting possibly a set of values of
Lebesgue measure zero, the limit exists:

$$\lim_{h \to 0} \frac{\omega(x+h) - \omega(x)}{h} = F(x),$$

where $F(x)$ is equivalent to $f(x)$. We omit the proof. If $f(x)$ is a
continuous function in $[a, b]$, there exists for all $x$ of $[a, b]$ the ordinary
derivative $\omega'(x) = f(x)$ of the integral with respect to the upper
limit. If $f(x)$ is absolutely as well as just simply continuous in $[a, b]$,
we obviously have

$$\omega'(x) = f(x) = \int_a^x h(x)\,dx + C,$$

where $h(x)$ is summable. We have $f'(x) = h(x)$, and $h(x)$ is called
the second order derivative of $\omega(x)$ and is written as usual as
$h(x) = \omega''(x)$. Similarly, $\omega(x)$ can have absolutely continuous derivatives
up to order $k$ and hence a summable derivative of order $(k + 1)$.
It is now expressible as

$$\omega(x) = \int_a^x dx \int_a^x dx \cdots \int_a^x \omega^{(k+1)}(x)\,dx + \omega(a) + \frac{\omega'(a)}{1!}(x - a) + \cdots + \frac{\omega^{(k)}(a)}{k!}(x - a)^k.$$










All the above theory is readily extended to the case when $\omega(x)$ is
absolutely continuous with respect to the non-decreasing function
$g(x)$, which we shall assume continuous, i.e. given any $\varepsilon > 0$, there
is a corresponding positive $\eta$ such that, if $(a_k, b_k)$ are non-overlapping
intervals, for which

$$\sum_{k=1}^{n} |g(b_k) - g(a_k)| < \eta, \tag{36}$$

then

$$\sum_{k=1}^{n} |\omega(b_k) - \omega(a_k)| < \varepsilon. \tag{37}$$

Precisely as above, we can write (28) instead of (37), and $\omega(x)$ is
a continuous function in $[a, b]$. Instead of (32), we have

$$\omega(x) = \int_a^x f(x)\,dg(x) + \omega(a). \tag{38}$$

75. Absolutely continuous set functions. We return to the general
case of sets on a plane and consider in more detail the transformation
performed by (16), on the assumption that $f(P)$ is non-negative
and summable throughout the plane. If $f(P)$ is defined
and summable on some measurable set $\mathcal{E}_0$, on continuing it by zero
outside $\mathcal{E}_0$, we obtain a function summable throughout the plane.
Formula (16) defines a completely additive function $\varphi(\mathcal{E})$ on the
field $L_G$. This function is thereby defined for all semi-open intervals,
and we can continue this function of an interval $\varphi(\Delta)$ on the field
$L_\varphi$, as we did earlier as regards $G(\Delta)$.

THEOREM 1. Every set $\mathcal{E}$ of $L_G$ belongs to $L_\varphi$, and (16) gives the measure
$\varphi(\mathcal{E})$ of this set, obtained when $\varphi(\Delta)$ is continued as indicated.

Every open set $O$ is the sum of a denumerable number of non-overlapping
semi-open intervals $\Delta_k$ $(k = 1, 2, \ldots)$. On summing both
sides of

$$\varphi(\Delta_k) = \int_{\Delta_k} f(P)\,G(d\mathcal{E})$$

over $k$, we obtain on the left the measure $\varphi(O)$ (since the measure is
additive), and on the right the integral over $O$ (because the integral
is additive), i.e. (16) gives the measure $\varphi(O)$ of any open set. On observing
that any closed set $F$ is the difference between the entire plane
$\mathcal{E}'$ (an open set) and some open set $O$ ($O = \mathcal{E}' - F$), and subtracting
term by term the formulae

$$\varphi(\mathcal{E}') = \int_{\mathcal{E}'} f(P)\,G(d\mathcal{E}); \quad \varphi(O) = \int_{O} f(P)\,G(d\mathcal{E}),$$

we find, as above, that (16) gives the measure $\varphi(F)$ of any closed
set $F$.

Let $\mathcal{E}$ be a set of $L_G$ and $\varepsilon_n$ a sequence of positive numbers tending
to zero. We know that there exist sequences $F_n$ and $O_n$ of closed
and open sets such that $F_n \subset \mathcal{E} \subset O_n$ and
$G(O_n - F_n) = G(O_n) - G(F_n) < \varepsilon_n$. Since integral (16) is absolutely
continuous, $\varphi(O_n - F_n) \to 0$, so that $\mathcal{E}$ belongs to $L_\varphi$ [38]. Here, the
measure of $\mathcal{E}$ in $L_\varphi$ is obviously the limit of $\varphi(F_n)$ or $\varphi(O_n)$, i.e. the
limit of the integrals

$$\int_{F_n} f(P)\,G(d\mathcal{E}),$$

where $F_n \subset \mathcal{E}$ and $G(\mathcal{E} - F_n) \to 0$; also, since the integral is absolutely
continuous, this limit is the integral over $\mathcal{E}$, i.e. the measure of $\mathcal{E}$
in $L_\varphi$ is given by integral (16), and the theorem is proved. The theorem
still holds under the assumption that the non-negative function $f(P)$
is summable only with respect to some bounded set, and the proof
remains the same in essence. The next theorem gives a more precise
idea of the constitution of $L_\varphi$.

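THEOREM 2. The necessary and sufficient condition for a set $\mathcal{E}$ to
belong to $L_\varphi$ is that $\mathcal{E}$ can be expressed as the sum

$$\mathcal{E} = \mathcal{E}^{(1)} + \mathcal{E}^{(2)}, \tag{39}$$

where $\mathcal{E}^{(1)} \in L_G$ and we have $f(P) = 0$ at points $P$ of $\mathcal{E}^{(2)}$.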
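Necessity. Let $\mathcal{E}$ belong to $L_\varphi$. We introduce the sets of points $P$

$$\mathcal{E}_0 = \{P : f(P) = 0\}; \quad \mathcal{E}_1 = \{P : f(P) \ge 1\}; \quad \mathcal{E}_n = \Bigl\{P : \frac{1}{n} \le f(P) < \frac{1}{n-1}\Bigr\} \quad (n = 2, 3, \ldots) \tag{40}$$

and also write $\mathcal{E}'$ for the set of points at which $f(P)$ is not defined
or is equal to $(+\infty)$. The set $\mathcal{E}'$ is measurable with respect to $G(\mathcal{E})$
and $G(\mathcal{E}') = 0$. The same can be said of any part of $\mathcal{E}'$. The function
$f(P)$, measurable with respect to $G(\mathcal{E})$, is therefore also measurable
with respect to $\varphi(\mathcal{E})$, and all the sets (40), like $\mathcal{E}'$, belong to $L_\varphi$.
We further define the sets

$$\mathcal{E}^{(1)} = \mathcal{E}\mathcal{E}' + \sum_{n=1}^{\infty} \mathcal{E}\mathcal{E}_n; \quad \mathcal{E}^{(2)} = \mathcal{E}\mathcal{E}_0. \tag{41}$$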


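Formula (39) holds here. At points of $\mathcal{E}^{(2)}$, $f(P)$ vanishes, and we
have to show that $\mathcal{E}^{(1)} \in L_G$. The set $\mathcal{E}'$ has measure zero with respect
to $G(\mathcal{E})$, and it is sufficient to show that the sets $\mathcal{E}\mathcal{E}_n$ are measurable
with respect to $G(\mathcal{E})$. Since the $\mathcal{E}\mathcal{E}_n$ are measurable with respect to
$\varphi(\mathcal{E})$, there exist closed sets $F_n$ and open sets $O_n$ such that

$$F_n \subset \mathcal{E}\mathcal{E}_n \subset O_n \quad \text{and} \quad \varphi(O_n - F_n) < \frac{\varepsilon}{n}, \tag{42}$$

where $\varepsilon$ is a given positive number. We form the sets

$$D_n = \mathcal{E}_n (O_n - F_n) = \mathcal{E}_n O_n - \mathcal{E}_n F_n, \tag{43}$$

i.e. $D_n$ is the set of the points of $O_n - F_n$ at which
$1/n \le f(P) < 1/(n - 1)$ or $f(P) \ge 1$ (with $n = 1$). Since $O_n - F_n \in L_G$, and $f(P)$
is measurable with respect to $G(\mathcal{E})$, the sets $D_n \in L_G$. Further, since
$F_n \subset \mathcal{E}\mathcal{E}_n$, it follows that $F_n \subset \mathcal{E}_n$ and $\mathcal{E}_n F_n = F_n$. Since $\mathcal{E}\mathcal{E}_n \subset O_n$,
it follows that $\mathcal{E}\mathcal{E}_n \subset O_n \mathcal{E}_n$ and, by (43), $\mathcal{E}\mathcal{E}_n - F_n \subset D_n$, so that
$|\mathcal{E}\mathcal{E}_n - F_n|_G \le |D_n|_G$. But the set $D_n \in L_G$, and we can write
$|D_n|_G = G(D_n)$, i.e.

$$|\mathcal{E}\mathcal{E}_n - F_n|_G \le G(D_n). \tag{44}$$

Further, in view of (43) and $\mathcal{E}_n F_n = F_n$, we have $D_n \subset O_n - F_n$
and we get, on making use of (42):

$$\int_{D_n} f(P)\,G(d\mathcal{E}) = \varphi(D_n) \le \varphi(O_n - F_n) < \frac{\varepsilon}{n}. \tag{45}$$

The set $D_n$ enters into $\mathcal{E}_n$ by (43), and $f(P) \ge 1/n$ on the latter
set. We thus have

$$\int_{D_n} f(P)\,G(d\mathcal{E}) \ge \frac{1}{n}\,G(D_n),$$

and inequality (45) leads to

$$\frac{G(D_n)}{n} < \frac{\varepsilon}{n},$$

i.e. $G(D_n) < \varepsilon$. By (44), we get $|\mathcal{E}\mathcal{E}_n - F_n|_G < \varepsilon$ and, since $\varepsilon$ is
arbitrary, this shows us that $\mathcal{E}\mathcal{E}_n \in L_G$; the necessity of condition
(39) is proved.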

Let us prove the sufficiency. Given (39), where $\mathcal{E}^{(1)} \in L_G$ and
$f(P) = 0$ at the points of $\mathcal{E}^{(2)}$, we have to show that $\mathcal{E} \in L_\varphi$. The set
$\mathcal{E}^{(1)} \in L_\varphi$ by Theorem 1. It remains to prove the same for $\mathcal{E}^{(2)}$. The
set $H$, of all points at which $f(P) = 0$, is measurable with respect
to $G(\mathcal{E})$, so that $H \in L_\varphi$ and, by (16), $\varphi(H) = 0$. But $\mathcal{E}^{(2)} \subset H$,
so that $\mathcal{E}^{(2)}$ is measurable with respect to $\varphi(\mathcal{E})$ and has measure zero.
The theorem is thus proved. Since $\mathcal{E}^{(2)} \subset H$, so that $\varphi(\mathcal{E}^{(2)}) = 0$,
we can assert that $\varphi(\mathcal{E}) = \varphi(\mathcal{E}^{(1)})$, i.e. to evaluate $\varphi(\mathcal{E})$ we have
to use (16) with $\mathcal{E}$ replaced by $\mathcal{E}^{(1)}$. A further remark: in (39), all
the points of the set $\mathcal{E}$, at which $f(P) = 0$, can be referred to $\mathcal{E}^{(2)}$.
The set of these points $\mathcal{E}\mathcal{E}_0$ is measurable with respect to $\varphi(\mathcal{E})$.

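We shall now prove a theorem that enables us to reduce a Lebesgue-Stieltjes
integral with respect to $\varphi(\mathcal{E})$ to an integral with respect
to $G(\mathcal{E})$.

THEOREM 3. If $F(P)$ is defined, measurable with respect to $\varphi(\mathcal{E})$ and
summable on a set $\mathcal{E}$ which is measurable and of finite measure with
respect to $\varphi(\mathcal{E})$, the product $F(P) f(P)$ is measurable with respect to $G(\mathcal{E})$
in $\mathcal{E}^{(1)}$, and we have

$$\int_{\mathcal{E}} F(P)\,\varphi(d\mathcal{E}) = \int_{\mathcal{E}^{(1)}} F(P) f(P)\,G(d\mathcal{E}), \tag{46}$$

which may be written in the form

$$\int_{\mathcal{E}} F(P) \int_{d\mathcal{E}} f(P)\,G(d\mathcal{E}) = \int_{\mathcal{E}^{(1)}} F(P) f(P)\,G(d\mathcal{E}). \tag{46$_1$}$$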


On continuing $F(P)$ outside $\mathcal{E}$ by zero, we can assume that $F(P)$,
like $f(P)$, is defined everywhere. In addition, we can assume that
$F(P)$ and $f(P)$ have finite values at every point. The function $F(P)$
is measurable with respect to $\varphi(\mathcal{E})$ and $f(P)$ is measurable with respect
to $G(\mathcal{E})$, and hence also with respect to $\varphi(\mathcal{E})$. We introduce the new
function $F_0(P)$ by putting $F_0(P) = F(P)$ if $f(P) \ne 0$, and $F_0(P) = 0$
if $f(P) = 0$. In other words, $F_0(P) = F(P)[1 - \omega_H(P)]$, where $\omega_H(P)$ is
the characteristic function of the set $H$ of points at which $f(P) = 0$.
As mentioned, $H \in L_G$, so that $H \in L_\varphi$, i.e. both $F(P)$ and $\omega_H(P)$
are measurable with respect to $\varphi(\mathcal{E})$, i.e. $F_0(P)$ is also measurable
with respect to $\varphi(\mathcal{E})$. Let us show that $F_0(P)$ is also measurable
with respect to $G(\mathcal{E})$. Since $F_0(P)$ is measurable with respect to $\varphi(\mathcal{E})$,
given any $a$, the set $\mathcal{E}_a$ of the points at which $F_0(P) > a$ can be
written as $\mathcal{E}_a = \mathcal{E}_a^{(1)} + \mathcal{E}_a^{(2)}$, where $\mathcal{E}_a^{(1)} \in L_G$ and $f(P) = 0$ at points
of $\mathcal{E}_a^{(2)}$. If $a \ge 0$, by definition of $F_0(P)$, the set $\mathcal{E}_a^{(2)}$ is absent, so
that $\mathcal{E}_a \in L_G$. If $a < 0$, the set $\mathcal{E}_a$ contains the whole of $H$, and,
as mentioned above, we can assume in this case that $\mathcal{E}_a^{(2)}$ coincides
with $H$, i.e. that $f(P) > 0$ at all points of $\mathcal{E}_a^{(1)}$. But $H \in L_G$, so that
$\mathcal{E}_a = \mathcal{E}_a^{(1)} + H \in L_G$. Hence $\mathcal{E}_a \in L_G$ for all $a$, $F_0(P)$ is measurable
with respect to $G(\mathcal{E})$, and the product $F_0(P) f(P)$ is therefore also
measurable with respect to $G(\mathcal{E})$. We return to the set $\mathcal{E}$ mentioned
in the theorem. At points of this set the product $F_0(P) f(P)$ coincides
with $F(P) f(P)$, i.e. $F(P) f(P)$ is measurable with respect to $G(\mathcal{E})$ in
$\mathcal{E}^{(1)}$. We shall confine ourselves in the proof of (46) to the case when
$F(P)$ is bounded. The proof is very similar for unbounded functions.
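Let $|F(P)| < L$ and $\mathcal{E}_{n,k}$ be the set of all $P$ of $\mathcal{E}$ at which

$$\frac{kL}{2^n} \le F(P) < \frac{(k+1)L}{2^n} \quad (k = -2^n, -2^n + 1, \ldots, 2^n - 1).$$

We form the piecewise constant functions

$$F_n(P) = \frac{kL}{2^n}, \quad \text{if } P \in \mathcal{E}_{n,k}.$$

The sequence $F_n(P)$ increases as $n$ increases and is bounded in
absolute value by the number $L$. We have

$$\int_{\mathcal{E}} F_n(P)\,\varphi(d\mathcal{E}) = \sum_{k=-2^n}^{2^n-1} \frac{kL}{2^n}\,\varphi(\mathcal{E}_{n,k}) = \sum_{k=-2^n}^{2^n-1} \frac{kL}{2^n} \int_{\mathcal{E}_{n,k}^{(1)}} f(P)\,G(d\mathcal{E}),$$

where $\mathcal{E}_{n,k}^{(1)}$ is obtained from $\mathcal{E}_{n,k}$ in accordance with (39). On summing
over all the $k$ in question, we get the set $\mathcal{E}^{(1)}$, and, on taking $kL/2^n$
under the integral sign in the last formula, we arrive at

$$\int_{\mathcal{E}} F_n(P)\,\varphi(d\mathcal{E}) = \int_{\mathcal{E}^{(1)}} F_n(P) f(P)\,G(d\mathcal{E}).$$

The integrand on the right-hand side has an absolute value not
exceeding the integrable function $L f(P)$, and we can pass to the limit
under the integral sign on both sides, which leads us to (46). If $\mathcal{E}$
is measurable with respect to $G(\mathcal{E})$ also, we can replace $\mathcal{E}^{(1)}$ by $\mathcal{E}$
in (46), since $\mathcal{E}^{(2)} = \mathcal{E} - \mathcal{E}^{(1)}$ and $f(P) = 0$ on $\mathcal{E}^{(2)}$, so that the integral
over $\mathcal{E}^{(2)}$ vanishes.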
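Formula (46) can be checked by hand in the simplest possible setting. The sketch below is only an illustration under assumptions that are not in the text: the plane is replaced by four hypothetical points, G assigns a weight to each point, and the names G, f, F, phi are chosen here for convenience. On a finite set every subset is measurable, so the distinction between the two parts in (39) is immaterial, but the example still shows that the points where f vanishes contribute nothing to either side.

```python
from fractions import Fraction

# Hypothetical finite model: four points with G-weights, a density f >= 0
# and a bounded function F (all values are illustrative assumptions).
G = {"p1": Fraction(1, 2), "p2": Fraction(1, 4), "p3": Fraction(1, 8), "p4": Fraction(1, 8)}
f = {"p1": Fraction(2), "p2": Fraction(0), "p3": Fraction(3), "p4": Fraction(0)}
F = {"p1": Fraction(5), "p2": Fraction(-7), "p3": Fraction(1), "p4": Fraction(4)}


def phi(points):
    """Discrete analogue of (16): phi(E) is the sum of f * G over E."""
    return sum(f[p] * G[p] for p in points)


def integral_wrt_phi(func, points):
    """Integral of func over E with respect to phi (point p has mass phi({p}))."""
    return sum(func[p] * phi([p]) for p in points)


def integral_wrt_G(func, points):
    """Integral of func over E with respect to G."""
    return sum(func[p] * G[p] for p in points)


E = list(G)                                  # the whole set
E1 = [p for p in E if f[p] != 0]             # the part of E where f does not vanish
Ff = {p: F[p] * f[p] for p in E}             # the product F(P) f(P)

lhs = integral_wrt_phi(F, E)                 # left-hand side of (46)
rhs = integral_wrt_G(Ff, E1)                 # right-hand side of (46)
print(lhs, rhs)
```

Both sides are computed with exact fractions, so they can be compared with `==` rather than with a tolerance.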

The proofs of the last three theorems are taken from Stone’s Linear 
Transformations in Hilbert Space and Their Applications To Analysis. 

If $G(\Delta)$ is a function of bounded variation, we can use the canonical
form as the difference between two non-negative functions:

$$G(\mathcal{E}) = G_1(\mathcal{E}) - G_2(\mathcal{E})$$

and apply the theorem to $G_1(\mathcal{E})$ and $G_2(\mathcal{E})$. We have here, instead
of $L_G$, the field $L_V$, where $V(\mathcal{E}) = G_1(\mathcal{E}) + G_2(\mathcal{E})$. We also obtain a
representation of $\varphi(\mathcal{E})$ as the difference between two non-negative
functions $\varphi(\Delta) = \varphi_1(\Delta) - \varphi_2(\Delta)$, and instead of $L_\varphi$ we have $L_{V_1}$,
where $V_1(\Delta) = \varphi_1(\Delta) + \varphi_2(\Delta)$. We can consider similarly the case
when $f(P)$ changes sign. Here we have to write $f(P)$ as the difference
between a positive and negative part:

$$f(P) = f^{+}(P) - f^{-}(P)$$

and prove the theorem and formula for each separate term. Now,
$\varphi(\Delta)$ is expressible as the difference between two non-negative functions,
and the field of sets $L_\varphi$ is replaced by the field of sets measurable
with respect to the function

$$\psi(\mathcal{E}) = \int_{\mathcal{E}} |f(P)|\,G(d\mathcal{E}).$$

The theorem can similarly be extended to the case of complex
functions $f(P)$ and $G(\mathcal{E})$.

Let us consider further the form that the theorem takes in the
case of one variable. Let $g(x)$ be a non-decreasing and bounded
function on the finite interval $[a, b]$ and $f(x)$ be non-negative and
summable with respect to $g(x)$ on $[a, b]$. We consider the function

$$\omega(x) = \int_{[a, x]} f(x)\,dg(x).$$

Every set measurable with respect to $g(x)$ will also be measurable
with respect to $\omega(x)$, and the necessary and sufficient condition
for the set $\mathcal{E}$ to be measurable with respect to $\omega(x)$ is that it can
be written in the form (39), where $\mathcal{E}^{(1)}$ is measurable with respect
to $g(x)$ and $f(x) = 0$ at all points of $\mathcal{E}^{(2)}$. If $F(x)$ is summable with
respect to $\omega(x)$ on the set $\mathcal{E}$ measurable with respect to $\omega(x)$, we have

$$\int_{\mathcal{E}} F(x)\,d\Bigl[\int_{[a, x]} f(x)\,dg(x)\Bigr] = \int_{\mathcal{E}^{(1)}} F(x) f(x)\,dg(x), \tag{47}$$

where $\mathcal{E}^{(1)}$ is the part of $\mathcal{E}$ appearing in (39). If $\mathcal{E}$ is measurable
with respect to $g(x)$, $\mathcal{E}^{(1)}$ can be replaced by $\mathcal{E}$. We obtain when
$g(x) = x$:

$$\int_{\mathcal{E}} F(x)\,d\Bigl[\int_a^x f(x)\,dx\Bigr] = \int_{\mathcal{E}^{(1)}} F(x) f(x)\,dx. \tag{48}$$

Hence, if $\omega(x)$ has no singular term, the Lebesgue-Stieltjes integral
of $F(x)$ with respect to $\omega(x)$ is given by the Lebesgue integral. If $\mathcal{E}$
is the interval $[a, b]$, (47) and (48) can be written as

$$\int_{[a,b]} F(x)\,d\Bigl[\int_{[a,x]} f(x)\,dg(x)\Bigr] = \int_{[a,b]} F(x) f(x)\,dg(x), \tag{49}$$

$$\int_a^b F(x)\,d\Bigl[\int_a^x f(x)\,dx\Bigr] = \int_a^b F(x) f(x)\,dx, \tag{49$_1$}$$


where e.g. in the latter formula $F(x)$ is assumed summable with respect
to the indefinite integral of $f(x)$. Suppose that $\Phi(x)$ and $\Psi(x)$ are
absolutely continuous functions, i.e.

$$\Phi(x) = \int_a^x \Phi'(x)\,dx + C_1; \quad \Psi(x) = \int_a^x \Psi'(x)\,dx + C_2, \tag{50}$$

where $\Phi'(x)$ and $\Psi'(x)$ are summable on $[a, b]$. On using (49$_1$), we can
write:

$$\int_a^b \Phi(x)\Psi'(x)\,dx + \int_a^b \Psi(x)\Phi'(x)\,dx = \int_a^b \Phi(x)\,d\Psi(x) + \int_a^b \Psi(x)\,d\Phi(x),$$

where the integrals on the right are ordinary Stieltjes integrals,
since $\Phi(x)$ and $\Psi(x)$ are continuous and of bounded variation. We have
the formula for the right-hand side [2]:

$$\int_a^b \Phi(x)\,d\Psi(x) + \int_a^b \Psi(x)\,d\Phi(x) = [\Phi(x)\Psi(x)]_{x=a}^{x=b},$$

and substitution in the previous formula gives us the formula for
integration by parts:

$$\int_a^b \Phi(x)\Psi'(x)\,dx + \int_a^b \Psi(x)\Phi'(x)\,dx = [\Phi(x)\Psi(x)]_{x=a}^{x=b}. \tag{51}$$

It follows at once from (50) that, in the case of the sum $\Phi(x) + \Psi(x)$,
the integrand is equal to $\Phi'(x) + \Psi'(x)$, i.e.
$[\Phi(x) + \Psi(x)]' = \Phi'(x) + \Psi'(x)$. On putting $b = x$ in (51), we get

$$\Phi(x)\Psi(x) = \int_a^x [\Phi(x)\Psi'(x) + \Psi(x)\Phi'(x)]\,dx + \Phi(a)\Psi(a),$$

i.e.

$$[\Phi(x)\Psi(x)]' = \Phi(x)\Psi'(x) + \Psi(x)\Phi'(x).$$

The case of an infinite interval can be treated similarly.
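The integration by parts formula (51) holds for absolutely continuous functions whose derivatives exist only almost everywhere. The following numerical sketch illustrates this for one assumed pair, $\Phi(x) = |x - \tfrac12|$ (which has no derivative at $x = \tfrac12$) and $\Psi(x) = x^2$ on $[0, 1]$; the functions, the interval and the midpoint rule are choices made here for illustration and are not taken from the text.

```python
def midpoint_integral(func, a, b, n=2000):
    """Approximate the integral of func over [a, b] by the midpoint rule."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


def Phi(x):
    return abs(x - 0.5)


def dPhi(x):
    # Derivative of Phi almost everywhere; the value at x = 0.5 is irrelevant.
    return 1.0 if x > 0.5 else -1.0


def Psi(x):
    return x * x


def dPsi(x):
    return 2.0 * x


a, b = 0.0, 1.0
# Left-hand side of (51): integral of Phi * Psi' + Psi * Phi' over [a, b].
lhs = midpoint_integral(lambda x: Phi(x) * dPsi(x) + Psi(x) * dPhi(x), a, b)
# Right-hand side of (51): the increment of the product Phi * Psi.
rhs = Phi(b) * Psi(b) - Phi(a) * Psi(a)
print(lhs, rhs)
```

The number of nodes is even, so the point $x = \tfrac12$ is never a midpoint and the undefined derivative is never evaluated.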




76. Example. We shall now give an example of a non-decreasing
continuous function which is not absolutely continuous, and for
which the second, i.e. absolutely continuous, term is absent in (20).
We start by forming a closed set $F_0$ on the interval $[0, 1]$.
We split $[0, 1]$ into three equal parts by the points 1/3 and 2/3, then
remove the central open interval (1/3, 2/3). We divide each of the
remaining intervals $[0, 1/3]$ and $[2/3, 1]$ into three equal parts: the first
by the points 1/9 and 2/9, and the second by the points 7/9 and 8/9.
We then remove from each of these intervals the central parts (1/9,
2/9) and (7/9, 8/9). Each of the remaining intervals:

$$\Bigl[0, \frac{1}{9}\Bigr], \quad \Bigl[\frac{2}{9}, \frac{1}{3}\Bigr], \quad \Bigl[\frac{2}{3}, \frac{7}{9}\Bigr], \quad \Bigl[\frac{8}{9}, 1\Bigr]$$

is again divided into three equal parts and the central open interval
removed, and so on. Thus all in all we remove from $[0, 1]$ a denumerable
number of open intervals having no common points or even
common ends:

$$\Bigl(\frac{1}{3}, \frac{2}{3}\Bigr), \ \Bigl(\frac{1}{9}, \frac{2}{9}\Bigr), \ \Bigl(\frac{7}{9}, \frac{8}{9}\Bigr), \ \Bigl(\frac{1}{27}, \frac{2}{27}\Bigr), \ \Bigl(\frac{7}{27}, \frac{8}{27}\Bigr), \ \Bigl(\frac{19}{27}, \frac{20}{27}\Bigr), \ \Bigl(\frac{25}{27}, \frac{26}{27}\Bigr), \ \ldots \tag{52}$$

i.e. we remove an open set $H_0$, the set that remains, which we write
as $F_0$, being closed. One open interval of length 1/3 is removed in
the first step, two of length $1/3^2$ in the second, $2^2$ of length $1/3^3$ in the
third, and in general, $2^{n-1}$ intervals of length $1/3^n$ in the $n$th step.
The Lebesgue measure of the open set $H_0$ is thus equal to

$$\sum_{n=1}^{\infty} \frac{2^{n-1}}{3^n} = \frac{1}{3} \cdot \frac{1}{1 - \frac{2}{3}} = 1,$$

and the set $F_0$ remaining on the interval $[0, 1]$ thus has measure zero.
We now define a function $f(x)$ on $[0, 1]$ as follows: we put

$$f(x) = \frac{1}{2}, \ \text{if } x \in \Bigl(\frac{1}{3}, \frac{2}{3}\Bigr);$$

$$f(x) = \frac{1}{4}, \ \text{if } x \in \Bigl(\frac{1}{9}, \frac{2}{9}\Bigr); \quad f(x) = \frac{3}{4}, \ \text{if } x \in \Bigl(\frac{7}{9}, \frac{8}{9}\Bigr);$$

in general, we put $f(x)$ equal to $1/2^n, 3/2^n, 5/2^n, \ldots, (2^n - 1)/2^n$ in the
sequence (from left to right) of intervals which we remove at the $n$th
step. Thus $f(x)$ is so far defined at points of the set $H_0$ and is constant
in each of the open intervals (52) of which this set is composed.
We further define $f(x)$ at the ends of $[0, 1]$ by putting $f(0) = 0$ and
$f(1) = 1$. The principle in accordance with which we have defined
$f(x)$ on each of intervals (52) is as follows: in each interval of the
set $H_0$ obtained at the $n$th step, we put $f(x)$ equal to the arithmetic
mean of its values in the neighbouring intervals obtained earlier,
or at the ends of $[0, 1]$ if there are no previously obtained intervals
of $H_0$ on one side of our new interval of $H_0$. It follows directly from
this that $f(x)$ is a non-decreasing function on the set $H_0$. Let us
continue the definition of $f(x)$ on to $F_0$. Let $x_0 \in F_0$. Since $F_0$ has
measure zero, there is a point of $H_0$ in any $\varepsilon$-neighbourhood of $x_0$,
and if $x$ approaches $x_0$ from the left in the set $H_0$, $f(x)$ is non-decreasing
and has a limit, which we take as the value of $f(x)$ at $x = x_0$. In other
words, our definition amounts to this: we take $f(x_0)$ equal to the
strict upper bound of the values of $f(x)$ for $x$ less than $x_0$ and belonging
to $H_0$. At $x = 1$, this definition obviously leads to the previous value
$f(1) = 1$. We have thus defined throughout $[0, 1]$ a function which
is clearly non-decreasing. It may easily be seen to be continuous.
For, if it had a discontinuity at $x = x'$, at least one of the intervals
$[f(x' - 0), f(x')]$ or $[f(x'), f(x' + 0)]$ would not reduce to a point and
would not contain values of $f(x)$ inside itself because $f(x)$ is monotonic.
But the values of $f(x)$, defined above only on the set $H_0$, are everywhere
dense in $[0, 1]$, and we have arrived at an absurdity by assuming
a discontinuity of $f(x)$. We recall that $f(x)$ is constant on each of
intervals (52). On the basis of the non-decreasing continuous function
$f(x)$, we can form a completely additive non-negative set function
$\varphi(\mathcal{E})$, which is always defined on $B$-sets. By what has been said,
$\varphi(H_0) = 0$, and all the more, $\varphi(\mathcal{E}) = 0$ on every $B$-set forming part
of $H_0$. If we take the interval $[0, x]$, we can write:

$$[0, x] = [0, x] H_0 + [0, x] F_0,$$

so that

$$f(x) = \varphi([0, x]) = \varphi([0, x] H_0) + \varphi([0, x] F_0).$$

The first term is zero by what has been said, whilst the measure
of $F_0$ is zero, i.e. $f(x)$ reduces to a singular part [74]:

$$f(x) = \varphi([0, x] F_0),$$

where $F_0$ plays the role of $H$ in (20) and $f(x)$ the role of $\omega(x)$.
Let us investigate further the set $F_0$. The continuous non-decreasing
function $f(x)$ takes all real values from zero to unity. On each of the
excluded intervals, including the ends, $f(x)$ is constant, the set of
excluded intervals being denumerable. The set of all values of $f(x)$
is not denumerable (has the power of a continuum). It is therefore
evident that $F_0$ contains points different from the ends of the excluded
intervals. It can be shown that $F_0$ has the power of a continuum.
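The construction of this section lends itself to a direct computation. The sketch below is an illustration only: the function names and the use of exact fractions are choices made here. It evaluates $f(x)$ from the ternary digits of $x$, sums the lengths removed in the first $n$ steps, and shows why $f(x)$ is not absolutely continuous: after $n$ steps the $2^n$ remaining closed intervals have total length $(2/3)^n$, which is as small as desired, yet the sum of the increments of $f(x)$ over them is always 1.

```python
from fractions import Fraction


def cantor(x, depth=60):
    """Value f(x) of the function of this section, from the ternary digits of x."""
    x = Fraction(x)
    if x <= 0:
        return Fraction(0)
    if x >= 1:
        return Fraction(1)
    value, scale = Fraction(0), Fraction(1, 2)
    for _ in range(depth):
        x *= 3
        digit = int(x)            # next ternary digit: 0, 1 or 2
        if digit == 1:            # x lies in the closure of a removed interval
            return value + scale
        if digit == 2:
            value += scale
        x -= digit
        scale /= 2
    return value                  # truncated after `depth` digits


def removed_measure(n):
    """Total length removed in the first n steps: sum of 2^(k-1) / 3^k."""
    return sum(Fraction(2 ** (k - 1), 3 ** k) for k in range(1, n + 1))


def remaining_intervals(n):
    """The 2^n closed intervals left after n steps of the construction."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        refined = []
        for a, b in intervals:
            third = (b - a) / 3
            refined += [(a, a + third), (b - third, b)]
        intervals = refined
    return intervals


stage = remaining_intervals(5)
total_length = sum(b - a for a, b in stage)
total_increase = sum(cantor(b) - cantor(a) for a, b in stage)
print(total_length, total_increase)
```

For points whose ternary expansion does not terminate within `depth` digits the returned value is a truncation, accurate to within $2^{-\text{depth}}$.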


77. Absolutely continuous functions of several variables. Such functions can
be introduced along the same lines as for absolutely continuous functions of
a single variable (a point) [74]. We shall confine ourselves to functions of two
variables. Let $F(x, y)$ be a given continuous function in the two-dimensional
interval $\Delta_0$ $[a \le x \le b;\ c \le y \le d]$. With the aid of this we can form a function
$\varphi(\delta)$ of an interval contained in $\Delta_0$, i.e. if $\delta$ is the interval defined by
$x_1 \le x \le x_2$; $y_1 \le y \le y_2$, we put, as before:

$$\varphi(\delta) = F(x_2, y_2) - F(x_1, y_2) - F(x_2, y_1) + F(x_1, y_1), \tag{53}$$

where it is of no importance whether $\delta$ is open or not, since $F(x, y)$ is continuous
by hypothesis. If we add to $F(x, y)$ the sum $f_1(x) + f_2(y)$, in which the first
term depends only on $x$ and the second only on $y$, this has no effect on $\varphi(\delta)$.
The interval function $\varphi(\delta)$ is said to be absolutely continuous if it satisfies
a condition analogous to (24) of [74], i.e. if, given $\varepsilon > 0$, there is a corresponding
$\eta > 0$ such that, when the sum of the areas of the non-overlapping intervals
$\delta_k$ $(k = 1, 2, \ldots, n)$ is $< \eta$, we have

$$\Bigl|\sum_{k=1}^{n} \varphi(\delta_k)\Bigr| < \varepsilon.$$

DEFINITION. $F(x, y)$ is described as an absolutely continuous function of two
variables $(x, y)$ if $\varphi(\delta)$ defined by (53) is an absolutely continuous function of an
interval and if, in addition, $F(a, y)$ and $F(x, c)$ are absolutely continuous functions
of $y$ and $x$.

The latter proviso regarding the absolute continuity of $F(x, y)$ on the lower
and left-hand sides of the interval $\Delta_0$ is necessary because of the possibility
of adding the sum $f_1(x) + f_2(y)$ to $F(x, y)$. We write down the obvious equation:

$$F(x, y) = [F(x, y) - F(a, y) - F(x, c) + F(a, c)] + [F(x, c) - F(a, c)] + [F(a, y) - F(a, c)] + F(a, c).$$

The first term on the right-hand side is $\varphi(\delta_{x,y})$, where $\delta_{x,y}$ is the interval
$a \le x' \le x$, $c \le y' \le y$, and, as in [75], this function can be expressed as
an indefinite double integral of a summable function. The second and third
terms on the right-hand side are absolutely continuous functions of $x$ and $y$,
and hence are expressible as simple indefinite integrals. Thus every absolutely
continuous function $F(x, y)$ can be expressed as

$$F(x, y) = \int_a^x \!\! \int_c^y f(x, y)\,dx\,dy + \int_a^x g(x)\,dx + \int_c^y h(y)\,dy + F(a, c). \tag{54}$$
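The remark that adding $f_1(x) + f_2(y)$ to $F(x, y)$ leaves the interval function (53) unchanged is easily verified on an example. In the sketch below the particular function $F(x, y) = x^2 y + 3xy$, the added terms and the rectangle are hypothetical choices made for illustration; for this $F$ the mixed derivative is $2x + 3$, and its double integral over the rectangle gives the same number as (53).

```python
def phi(F, x1, x2, y1, y2):
    """Interval function (53) for the rectangle x1 <= x <= x2, y1 <= y <= y2."""
    return F(x2, y2) - F(x1, y2) - F(x2, y1) + F(x1, y1)


def F(x, y):
    # Hypothetical smooth example; its mixed derivative is 2x + 3.
    return x * x * y + 3 * x * y


def F_shifted(x, y):
    # F plus a term depending only on x and a term depending only on y.
    return F(x, y) + (x ** 3 - 2 * x) + (5 * y * y + 1)


rect = (1, 2, 0, 3)              # x1, x2, y1, y2
print(phi(F, *rect), phi(F_shifted, *rect))
```

The added terms cancel in pairs in (53), which is why the separate conditions on $F(a, y)$ and $F(x, c)$ are needed in the definition.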


It may easily be seen that, conversely, every function expressible by the last
formula is absolutely continuous. We can use Fubini's theorem to rewrite the
last formula as

$$F(x, y) = \int_a^x \Bigl[\int_c^y f(x, y)\,dy + g(x)\Bigr] dx + \int_c^y h(y)\,dy + F(a, c), \tag{55}$$

or

$$F(x, y) = \int_c^y \Bigl[\int_a^x f(x, y)\,dx + h(y)\Bigr] dy + \int_a^x g(x)\,dx + F(a, c). \tag{56}$$

It is clear from this that, if $F(x, y)$ is an absolutely continuous function of
two variables, it is an absolutely continuous function of $x$ for any fixed value of
$y$, and an absolutely continuous function of $y$ for any fixed $x$. The converse is
false, i.e. a function may be absolutely continuous in each variable yet not
be absolutely continuous in both.

By the definition of [74], the integrands of the first terms of (55) and (56)
yield the partial derivatives of the absolutely continuous function $F(x, y)$:

$$\frac{\partial F(x, y)}{\partial x} = \int_c^y f(x, y)\,dy + g(x); \quad \frac{\partial F(x, y)}{\partial y} = \int_a^x f(x, y)\,dx + h(y). \tag{57}$$

The integrand in these formulae defines the second order mixed derivative:

$$\frac{\partial^2 F(x, y)}{\partial x\,\partial y} = f(x, y).$$

If the partial derivatives $F_x$ and $F_y$ are themselves absolutely continuous
functions of two variables, we can define all the second order partial derivatives.
If all these are absolutely continuous functions of two variables, we can define
all the third order derivatives, and so on.

It can be shown that the partial derivatives are the limits of the corresponding
ratios almost everywhere in $\Delta_0$. For instance, $F_x$ is the limit of the ratio
$[F(x + h, y) - F(x, y)]/h$. An absolutely continuous function $F(x, y)$ can be
interpreted as a function of a point $F(M)$ on the plane. If we introduce new
Cartesian coordinates $(x', y')$ in place of the old $(x, y)$ on this plane, we obtain
a new function $F_1(x', y')$, which may be no longer absolutely continuous in the
new variables. Let us take as an example the absolutely continuous function

$$F(x, y) = \int_0^x f(t)\,dt,$$

where $f(t)$ is the continuous, but not absolutely continuous, function that we
constructed in [76] for the interval $[0, 1]$. We continue it by assuming
$f(x) = 0$ for $x < 0$ and $f(x) = 1$ for $x > 1$. The above formula defines an absolutely
continuous function $F(x, y)$ throughout the plane (it in fact depends only on
$x$). On rotating the axes 45 degrees about the origin, we obtain in the new
coordinates:

$$F_1(x', y') = \int_0^{\frac{1}{\sqrt{2}}(x' + y')} f(t)\,dt.$$

The partial derivative of this function with respect to $x'$, given by

$$\frac{\partial F_1}{\partial x'} = \frac{1}{\sqrt{2}}\,f\Bigl(\frac{x' + y'}{\sqrt{2}}\Bigr),$$

is not an absolutely continuous function of $y'$ for a given $x'$, as must be the
case, by (57), if $F_1(x', y')$ is to be an absolutely continuous function of two variables.
Notice that, whatever the choice of Cartesian coordinates, the function constructed
is absolutely continuous with respect to each variable for all values
of the other variable. A theory of absolutely continuous functions of any
number of variables can be built up on the same lines as above.
A more general definition of partial derivatives will be given in the next
chapter, applicable to a wider class than absolutely continuous functions of
several variables.


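78. Supplementary propositions. In this section and the next we
shall introduce some new concepts and prove supplementary propositions
needed for the proof of the fundamental theorem of [73]
and for further generalization of the concept of integral.

We take functions $\varphi(\mathcal{E})$, completely additive on the family $C$
consisting of some set $\mathcal{E}_0$ of $L_G$ and of all the sets of $L_G$ forming
part of $\mathcal{E}_0$, $G(\mathcal{E}_0)$ being assumed finite and non-zero. We write $V_1$
for the set of all such functions. If $\varphi_1(\mathcal{E})$ and $\varphi_2(\mathcal{E}) \in V_1$, then
$c_1\varphi_1(\mathcal{E}) + c_2\varphi_2(\mathcal{E})$ also $\in V_1$. As we know, for any $\varphi(\mathcal{E})$ of $V_1$, the
sums

$$t_\delta(\varphi) = \sum_k |\varphi(\mathcal{E}_k)| \tag{58}$$

remain bounded for any subdivision $\delta$ of $\mathcal{E}_0$ into a finite number of
sets $\mathcal{E}_k$ [72]. We write $\|\varphi\|_1$ for the strict upper bound of the sums
$t_\delta$, i.e. the total variation of $\varphi(\mathcal{E})$ on $\mathcal{E}_0$. We obviously have
$\|c\varphi\|_1 = |c| \cdot \|\varphi\|_1$, where $c$ is a constant. If the subdivision $\delta'$ is a
continuation of the subdivision $\delta$, we write $\delta' > \delta$. On observing that,
given any subdivision $e = e' + e''$, we have $|\varphi(e)| \le |\varphi(e')| + |\varphi(e'')|$,
we can say that $t_{\delta'}(\varphi) \ge t_\delta(\varphi)$ if $\delta' > \delta$. If $\delta_n$ is a sequence
of subdivisions such that $t_{\delta_n}(\varphi) \to \|\varphi\|_1$ and $\delta_n' > \delta_n$, then all the
more $t_{\delta_n'}(\varphi) \to \|\varphi\|_1$. If $t_{\delta_n^{(1)}}(\varphi) \to \|\varphi\|_1$, $t_{\delta_n^{(2)}}(\psi) \to \|\psi\|_1$ and
$t_{\delta_n^{(3)}}(\varphi + \psi) \to \|\varphi + \psi\|_1$, then we obtain on putting
$\delta_n' = \delta_n^{(1)} \delta_n^{(2)} \delta_n^{(3)}$:

$$t_{\delta_n'}(\varphi) \to \|\varphi\|_1; \quad t_{\delta_n'}(\psi) \to \|\psi\|_1; \quad t_{\delta_n'}(\varphi + \psi) \to \|\varphi + \psi\|_1.$$

If $\mathcal{E}_{k,n}$ are subsets in $\delta_n'$, the inequality

$$|\varphi(\mathcal{E}_{k,n}) + \psi(\mathcal{E}_{k,n})| \le |\varphi(\mathcal{E}_{k,n})| + |\psi(\mathcal{E}_{k,n})|$$

gives us, on summation and passage to the limit:

$$\|\varphi + \psi\|_1 \le \|\varphi\|_1 + \|\psi\|_1. \tag{59}$$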


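For functions $\varphi(\mathcal{E})$, satisfying the condition

$$\varphi(\mathcal{E}) = 0 \quad \text{for} \quad G(\mathcal{E}) = 0, \tag{60}$$

we consider, in addition to sum (58), the sum

$$S_\delta(\varphi) = \sum_k \frac{\varphi^2(\mathcal{E}_k)}{G(\mathcal{E}_k)}. \tag{61}$$

If $G(\mathcal{E}_k) = 0$, the corresponding term has the form 0/0, and we
regard it as zero. We could dispense with this proviso if we agreed
to consider only those $\delta$ for which all the $G(\mathcal{E}_k) \ne 0$, which amounts
to associating all the $\mathcal{E}_k$ for which $G(\mathcal{E}_k) = 0$ with a different subset.
The set of values of $S_\delta(\varphi)$ is not necessarily bounded. Let us write
$V_2$ for the set of the functions $\varphi(\mathcal{E})$ which satisfy condition (60) and
for which the set $S_\delta(\varphi)$ is bounded. The set $V_2$ forms part of $V_1$.
It is easily seen that, if $\varphi(\mathcal{E}) \in V_2$ and $\psi(\mathcal{E}) \in V_2$, then $c\varphi(\mathcal{E}) \in V_2$
and $\varphi(\mathcal{E}) + \psi(\mathcal{E}) \in V_2$. The strict upper bound of the sums $S_\delta(\varphi)$
will be written as $\|\varphi\|_2$. Let us show that, if $\delta' > \delta$, then
$S_{\delta'}(\varphi) \ge S_\delta(\varphi)$. We only need to show that, if $\mathcal{E} = \mathcal{E}' + \mathcal{E}''$ is a subdivision
of $\mathcal{E}$, then

$$\frac{\varphi^2(\mathcal{E}')}{G(\mathcal{E}')} + \frac{\varphi^2(\mathcal{E}'')}{G(\mathcal{E}'')} \ge \frac{\varphi^2(\mathcal{E})}{G(\mathcal{E})}. \tag{62}$$

This inequality is equivalent to the following:

$$G(\mathcal{E})\,G(\mathcal{E}'')\,\varphi^2(\mathcal{E}') + G(\mathcal{E})\,G(\mathcal{E}')\,\varphi^2(\mathcal{E}'') - G(\mathcal{E}')\,G(\mathcal{E}'')\,\varphi^2(\mathcal{E}) \ge 0,$$

which can be rewritten, since $G(\mathcal{E}) = G(\mathcal{E}') + G(\mathcal{E}'')$ and
$\varphi(\mathcal{E}) = \varphi(\mathcal{E}') + \varphi(\mathcal{E}'')$, in the form

$$[G(\mathcal{E}'')\,\varphi(\mathcal{E}') - G(\mathcal{E}')\,\varphi(\mathcal{E}'')]^2 \ge 0.$$

As above, if $S_{\delta_n}(\varphi) \to \|\varphi\|_2$ and $\delta_n' > \delta_n$, then $S_{\delta_n'}(\varphi) \to \|\varphi\|_2$.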

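We now establish an inequality between $\|\varphi\|_1$ and $\|\varphi\|_2$ for
functions of $V_2$. Application of Cauchy's inequality gives us

$$\sum_k |\varphi(\mathcal{E}_k)| = \sum_k \frac{|\varphi(\mathcal{E}_k)|}{\sqrt{G(\mathcal{E}_k)}} \sqrt{G(\mathcal{E}_k)} \le \sqrt{\sum_k \frac{\varphi^2(\mathcal{E}_k)}{G(\mathcal{E}_k)}} \sqrt{\sum_k G(\mathcal{E}_k)},$$

i.e. $t_\delta(\varphi) \le \sqrt{S_\delta(\varphi)}\,\sqrt{G(\mathcal{E}_0)}$. As when deducing (59), we can form
a sequence of subdivisions $\delta_n$ such that $t_{\delta_n}(\varphi) \to \|\varphi\|_1$ and
$S_{\delta_n}(\varphi) \to \|\varphi\|_2$, and the inequality $t_\delta(\varphi) \le \sqrt{S_\delta(\varphi)}\,\sqrt{G(\mathcal{E}_0)}$ gives in the
limit:

$$\|\varphi\|_1 \le \sqrt{\|\varphi\|_2}\,\sqrt{G(\mathcal{E}_0)}. \tag{63}$$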
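The behaviour of the sums (58) and (61) under refinement, and the inequality obtained from Cauchy's inequality, can be observed on a simple example. The sketch below rests on assumptions made purely for illustration: the basic set is the interval $[0, 1]$ with Lebesgue measure (so that its measure is 1), the set function is the integral of $2x - 1$, and the subdivisions are into $n$ equal intervals, each subdivision with $2n$ parts being a continuation of the one with $n$ parts.

```python
from fractions import Fraction


def phi(a, b):
    """Hypothetical set function on intervals: integral of (2x - 1) over [a, b]."""
    return (b * b - b) - (a * a - a)


def uniform_subdivision(n):
    """Subdivision of [0, 1] into n equal intervals."""
    return [(Fraction(k, n), Fraction(k + 1, n)) for k in range(n)]


def t_sum(delta):
    """Sum (58): sum of |phi(E_k)| over the sets of the subdivision."""
    return sum(abs(phi(a, b)) for a, b in delta)


def S_sum(delta):
    """Sum (61): sum of phi(E_k)^2 / G(E_k), with G the length of the interval."""
    return sum(phi(a, b) ** 2 / (b - a) for a, b in delta)


measure_of_whole_set = Fraction(1)           # length of [0, 1]
for n in (2, 4, 8):
    delta = uniform_subdivision(n)
    print(n, t_sum(delta), S_sum(delta))
```

In this example the sums (58) already equal the total variation $\tfrac12$ for every even $n$, while the sums (61) increase with $n$ towards $\tfrac13$, the integral of $(2x - 1)^2$.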


One family of functions $\varphi(\mathcal{E})$ must be mentioned. A function
$\varphi(\mathcal{E})$ is said to be essentially bounded if there exists a constant $C$
such that, given any $\mathcal{E}$ of $L_G$ belonging to $\mathcal{E}_0$, we have

$$|\varphi(\mathcal{E})| \le C\,G(\mathcal{E}), \tag{64}$$

and the set of all essentially bounded functions (with different
values of $C$) will be denoted by $V_0$. It follows at once from (61) that,
if $\varphi(\mathcal{E}) \in V_0$, then $S_\delta(\varphi) \le C^2 G(\mathcal{E}_0)$ for any $\delta$, i.e. $V_0$ forms part
of $V_2$.

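We have already discussed piecewise constant point functions
[46]. We now introduce piecewise constant set functions. A $\psi(\mathcal{E})$
satisfying (60) is said to be piecewise constant in $\mathcal{E}_0$ if there exists
a subdivision of $\mathcal{E}_0$ into a finite number of sets $\mathcal{E}_k$ such that

$$\psi(\mathcal{E}) = \frac{\psi(\mathcal{E}_k)}{G(\mathcal{E}_k)}\,G(\mathcal{E}), \ \text{if } \mathcal{E} \subset \mathcal{E}_k \text{ and } G(\mathcal{E}_k) \ne 0; \qquad \psi(\mathcal{E}) = 0, \ \text{if } \mathcal{E} \subset \mathcal{E}_k \text{ and } G(\mathcal{E}_k) = 0. \tag{65}$$

If the ratio $\psi(\mathcal{E}_k)/G(\mathcal{E}_k)$ is denoted by $a_k$ (we take $a_k = 0$ if
$G(\mathcal{E}_k) = 0$), the piecewise constant $\psi(\mathcal{E})$ in question can obviously be
written as the integral

$$\psi(\mathcal{E}) = \int_{\mathcal{E}} \sum_k a_k\,\omega_{\mathcal{E}_k}(P)\,G(d\mathcal{E}), \tag{66}$$

where $\omega_{\mathcal{E}_k}(P)$ is the characteristic function of the set $\mathcal{E}_k$. We have
under the integral sign a piecewise constant point function, equal
to $a_k$ on the set $\mathcal{E}_k$. Conversely, every integral of the type indicated
gives a piecewise constant set function $\psi(\mathcal{E})$.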

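Given any subdivision $\delta$ of the set $\mathcal{E}_0$ into a finite number of
sets $\mathcal{E}_k$, we can associate piecewise constant functions $\varphi_\delta(\mathcal{E})$ and
$f_\delta(P)$ with any set function $\varphi(\mathcal{E})$, completely additive on $\mathcal{E}_0$ and
satisfying (60), and with any point function $f(P)$, measurable and
summable on $\mathcal{E}_0$. We define $\varphi_\delta(\mathcal{E})$ for $\mathcal{E}$ belonging to $\mathcal{E}_k$ by the
formula

$$\frac{\varphi_\delta(\mathcal{E})}{G(\mathcal{E})} = \frac{\varphi(\mathcal{E}_k)}{G(\mathcal{E}_k)}, \tag{67}$$

i.e. for any $\mathcal{E}$ of $\mathcal{E}_0$ by the formula

$$\varphi_\delta(\mathcal{E}) = \int_{\mathcal{E}} \sum_k a_k\,\omega_{\mathcal{E}_k}(P)\,G(d\mathcal{E}), \quad \text{where } a_k = \frac{\varphi(\mathcal{E}_k)}{G(\mathcal{E}_k)}. \tag{68}$$

We put $f_\delta(P)$ equal to a constant:

$$f_\delta(P) = \frac{1}{G(\mathcal{E}_k)} \int_{\mathcal{E}_k} f(P)\,G(d\mathcal{E}), \tag{69}$$

if $P \in \mathcal{E}_k$. If $G(\mathcal{E}_k) = 0$, (69) is taken equal to zero. If $\varphi(\mathcal{E})$ is expressed
by the integral:

$$\varphi(\mathcal{E}) = \int_{\mathcal{E}} f(P)\,G(d\mathcal{E}), \tag{70}$$

then obviously:

$$\varphi_\delta(\mathcal{E}) = \int_{\mathcal{E}} f_\delta(P)\,G(d\mathcal{E}). \tag{71}$$

It follows at once from the definitions of $\varphi_\delta(\mathcal{E})$ and $f_\delta(P)$ that:

$$(c_1\varphi(\mathcal{E}) + c_2\psi(\mathcal{E}))_\delta = c_1\varphi_\delta(\mathcal{E}) + c_2\psi_\delta(\mathcal{E}); \qquad (c_1 f(P) + c_2 F(P))_\delta = c_1 f_\delta(P) + c_2 F_\delta(P). \tag{72}$$

Let us show that, if $\delta' > \delta$, then

$$(\varphi_\delta(\mathcal{E}))_{\delta'} = \varphi_\delta(\mathcal{E}). \tag{73}$$

With $\delta'$, we have a subdivision of each $\mathcal{E}_k$ into sets $\mathcal{E}_{k,s}$, where,
by (67):

$$\frac{\varphi_\delta(\mathcal{E}_{k,s})}{G(\mathcal{E}_{k,s})} = \frac{\varphi(\mathcal{E}_k)}{G(\mathcal{E}_k)},$$

whence it follows that

$$(\varphi_\delta(\mathcal{E}))_{\delta'} = \int_{\mathcal{E}} \sum_{k,s} \frac{\varphi_\delta(\mathcal{E}_{k,s})}{G(\mathcal{E}_{k,s})}\,\omega_{\mathcal{E}_{k,s}}(P)\,G(d\mathcal{E}) = \int_{\mathcal{E}} \sum_k \frac{\varphi(\mathcal{E}_k)}{G(\mathcal{E}_k)} \sum_s \omega_{\mathcal{E}_{k,s}}(P)\,G(d\mathcal{E}) = \int_{\mathcal{E}} \sum_k a_k\,\omega_{\mathcal{E}_k}(P)\,G(d\mathcal{E}) = \varphi_\delta(\mathcal{E}).$$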

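If $c$ is the greatest of the $|a_k|$, by (65), $|\psi(\mathcal{E})| \le c\,G(\mathcal{E})$, i.e. every
piecewise constant set function is essentially bounded. Notice
that we are considering piecewise constant functions only with
subdivisions of $\mathcal{E}_0$ into a finite number of sets.

Let $\delta' > \delta$ and $\mathcal{E}_{k,s}$ be the above-mentioned subsets. It follows
from the definition of $\varphi_\delta(\mathcal{E})$ that

$$\frac{\varphi_\delta(\mathcal{E}_{k,s})}{G(\mathcal{E}_{k,s})} = \frac{\varphi_\delta(\mathcal{E}_k)}{G(\mathcal{E}_k)}; \quad \varphi_\delta(\mathcal{E}_k) = \varphi(\mathcal{E}_k),$$

so that

$$S_{\delta'}(\varphi_\delta) = \sum_{k,s} \frac{\varphi_\delta^2(\mathcal{E}_{k,s})}{G(\mathcal{E}_{k,s})} = \sum_k \frac{\varphi_\delta(\mathcal{E}_k)}{G(\mathcal{E}_k)} \sum_s \varphi_\delta(\mathcal{E}_{k,s}) = \sum_k \frac{\varphi(\mathcal{E}_k)}{G(\mathcal{E}_k)}\,\varphi(\mathcal{E}_k) = \sum_k \frac{\varphi^2(\mathcal{E}_k)}{G(\mathcal{E}_k)},$$

i.e.

$$S_{\delta'}(\varphi_\delta) = S_\delta(\varphi). \tag{74}$$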


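If $S_{\delta_n}(\varphi_\delta) \to \|\varphi_\delta\|_2$, then all the more $S_{\delta_n'}(\varphi_\delta) \to \|\varphi_\delta\|_2$ when
$\delta_n' = \delta_n \delta$. But $\delta_n' > \delta$ and, by (74), $S_{\delta_n'}(\varphi_\delta) = S_\delta(\varphi)$ and does not
depend on $n$, whence it follows that

$$\|\varphi_\delta\|_2 = S_\delta(\varphi). \tag{75}$$

If $\varphi(\mathcal{E}) \in V_2$, there exists a sequence of subdivisions $\delta^{(n)}$ such that
$S_{\delta^{(n)}}(\varphi) \to \|\varphi\|_2$, and hence there exists a sequence of subdivisions
such that

$$\|\varphi_{\delta^{(n)}}\|_2 \to \|\varphi\|_2. \tag{76}$$

The quantity (75) is easily expressed by an integral. Let $g_\delta(P)$
be the integrand in integral (66) for the piecewise constant function
$\varphi_\delta(\mathcal{E})$:

$$g_\delta(P) = a_k = \frac{\varphi(\mathcal{E}_k)}{G(\mathcal{E}_k)}, \quad \text{if } P \in \mathcal{E}_k.$$

We have

$$\frac{\varphi^2(\mathcal{E}_k)}{G(\mathcal{E}_k)} = a_k^2\,G(\mathcal{E}_k) = \int_{\mathcal{E}_k} g_\delta^2(P)\,G(d\mathcal{E}),$$

so that

$$S_\delta(\varphi) = \sum_k \frac{\varphi^2(\mathcal{E}_k)}{G(\mathcal{E}_k)} = \int_{\mathcal{E}_0} g_\delta^2(P)\,G(d\mathcal{E}),$$

i.e.

$$S_\delta(\varphi) = \|\varphi_\delta\|_2 = \int_{\mathcal{E}_0} g_\delta^2(P)\,G(d\mathcal{E}), \tag{77}$$

and, on using the usual notation for the norm in $L_2$:

$$\|\varphi_\delta\|_2 = \|g_\delta(P)\|_{L_2}^2. \tag{78}$$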


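Let us bring in two further formulae, required later. By (73), we
have $[\varphi(\mathcal{E}) - \varphi_\delta(\mathcal{E})]_{\delta'} = \varphi_{\delta'}(\mathcal{E}) - \varphi_\delta(\mathcal{E})$ for $\delta' > \delta$. The difference
$g_{\delta'}(P) - g_\delta(P)$, which is constant at points of the subsets $\mathcal{E}_{k,s}$ of
the subdivision $\delta'$, is the integrand in formula (66) for
$\varphi_{\delta'}(\mathcal{E}) - \varphi_\delta(\mathcal{E})$, i.e.

$$\varphi_{\delta'}(\mathcal{E}) - \varphi_\delta(\mathcal{E}) = \int_{\mathcal{E}} [g_{\delta'}(P) - g_\delta(P)]\, G(d\mathcal{E}),$$

and we can write, by (78):

$$\|(\varphi - \varphi_\delta)_{\delta'}\|_2 = \|\varphi_{\delta'} - \varphi_\delta\|_2 = \|g_{\delta'}(P) - g_\delta(P)\|_{L_2}. \tag{79}$$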


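The second formula concerns the function $g_{\delta'}(P)$. It has the constant
value $a_{k,s} = \varphi(\mathcal{E}_{k,s})/G(\mathcal{E}_{k,s})$ at points of $\mathcal{E}_{k,s}$ and, by the definition
of $f_\delta(P)$, the function $(g_{\delta'}(P))_\delta$ takes at points of $\mathcal{E}_k$ the constant
value

$$\frac{1}{G(\mathcal{E}_k)} \sum_s \frac{\varphi(\mathcal{E}_{k,s})}{G(\mathcal{E}_{k,s})}\, G(\mathcal{E}_{k,s}) = \frac{1}{G(\mathcal{E}_k)} \sum_s \varphi(\mathcal{E}_{k,s}) = \frac{\varphi(\mathcal{E}_k)}{G(\mathcal{E}_k)},$$

i.e.

$$(g_{\delta'}(P))_\delta = g_\delta(P). \tag{80}$$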


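79. Supplementary propositions (continued). A new concept must
be introduced, important for later constructions. Let $\varphi(\mathcal{E})$ and
$\psi(\mathcal{E}) \in V_1$, and $\mathcal{E} = \mathcal{E}' + \mathcal{E}''$ be a subdivision of $\mathcal{E}$ into two disjoint
subsets. We write $\inf[\varphi, \psi]$ for the strict lower bound of the
sums $\varphi(\mathcal{E}') + \psi(\mathcal{E}'')$ for all possible subdivisions of $\mathcal{E}$:

$$\inf[\varphi, \psi] = \inf_{\mathcal{E} = \mathcal{E}' + \mathcal{E}''} [\varphi(\mathcal{E}') + \psi(\mathcal{E}'')] = \omega(\mathcal{E}), \tag{81}$$

i.e.

$$\varphi(\mathcal{E}') + \psi(\mathcal{E}'') \ge \omega(\mathcal{E}), \tag{82}$$

and, given any positive $\varepsilon$, there exists a subdivision $\mathcal{E} = \mathcal{E}' + \mathcal{E}''$
such that

$$\varphi(\mathcal{E}') + \psi(\mathcal{E}'') < \omega(\mathcal{E}) + \varepsilon. \tag{83}$$

For any $\mathcal{E}$ of $\mathcal{E}_0$, the function $\omega(\mathcal{E})$ has a finite value, since $\varphi(\mathcal{E}')$
and $\psi(\mathcal{E}'')$ are bounded. If we take $\mathcal{E}$ itself as $\mathcal{E}'$ and the empty set
as $\mathcal{E}''$, or vice versa, we get

$$\omega(\mathcal{E}) \le \varphi(\mathcal{E}); \quad \omega(\mathcal{E}) \le \psi(\mathcal{E}). \tag{84}$$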


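Let us show that $\omega(\mathcal{E})$ is completely additive. Given a subdivision
of $\mathcal{E}$ into a finite or infinite number of (disjoint) sets $\mathcal{E}_k$, we have

$$\varphi(\mathcal{E}) = \sum_k \varphi(\mathcal{E}_k); \quad \psi(\mathcal{E}) = \sum_k \psi(\mathcal{E}_k),$$

these series being absolutely convergent. By (84), the positive $\omega(\mathcal{E}_k)$
form a convergent series, since the series with terms $\varphi(\mathcal{E}_k)$ and $\psi(\mathcal{E}_k)$
are absolutely convergent, so that the whole of the series formed
from the $\omega(\mathcal{E}_k)$ has a definite sum, independent of the order of the
terms [I; 134]. Given $\varepsilon > 0$, there exists a subdivision $\mathcal{E} = \mathcal{E}' + \mathcal{E}''$
such that

$$\varphi(\mathcal{E}') + \psi(\mathcal{E}'') \ge \omega(\mathcal{E}) \quad \text{and} \quad \varphi(\mathcal{E}') + \psi(\mathcal{E}'') < \omega(\mathcal{E}) + \varepsilon. \tag{85}$$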


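We write $\mathcal{E}'_k = \mathcal{E}_k\mathcal{E}'$ and $\mathcal{E}''_k = \mathcal{E}_k\mathcal{E}''$, so that

$$\mathcal{E}'_k + \mathcal{E}''_k = \mathcal{E}_k; \quad \sum_k \mathcal{E}'_k = \mathcal{E}'; \quad \sum_k \mathcal{E}''_k = \mathcal{E}''.$$

On using the definition of $\omega(\mathcal{E})$, we can write $\varphi(\mathcal{E}'_k) + \psi(\mathcal{E}''_k) \ge \omega(\mathcal{E}_k)$,
and we obtain on summing over $k$, since $\varphi(\mathcal{E})$ and $\psi(\mathcal{E})$
are completely additive:

$$\sum_k \omega(\mathcal{E}_k) \le \varphi(\mathcal{E}') + \psi(\mathcal{E}''),$$

and the second of inequalities (85) gives

$$\sum_k \omega(\mathcal{E}_k) < \omega(\mathcal{E}) + \varepsilon,$$

whence, since $\varepsilon$ is arbitrary, we have

$$\sum_k \omega(\mathcal{E}_k) \le \omega(\mathcal{E}). \tag{86}$$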


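We now prove the reverse inequality. We take a subdivision
$\mathcal{E}_k = \mathcal{E}_{k1} + \mathcal{E}_{k2}$ such that

$$\varphi(\mathcal{E}_{k1}) + \psi(\mathcal{E}_{k2}) < \omega(\mathcal{E}_k) + \frac{\varepsilon}{2^k}. \tag{87}$$

On summing over $k$, and writing $\mathcal{E}^{(1)}$ for the sum of the $\mathcal{E}_{k1}$ and
$\mathcal{E}^{(2)}$ for the sum of the $\mathcal{E}_{k2}$, we get

$$\varphi(\mathcal{E}^{(1)}) + \psi(\mathcal{E}^{(2)}) < \sum_k \omega(\mathcal{E}_k) + \varepsilon, \tag{88}$$

where $\mathcal{E}^{(1)}\mathcal{E}^{(2)} = 0$ and $\mathcal{E}^{(1)} + \mathcal{E}^{(2)} = \mathcal{E}$. But $\varphi(\mathcal{E}^{(1)}) + \psi(\mathcal{E}^{(2)}) \ge \omega(\mathcal{E})$, by
the definition of $\omega(\mathcal{E})$, and inequality (88) gives

$$\omega(\mathcal{E}) < \sum_k \omega(\mathcal{E}_k) + \varepsilon,$$

and, since $\varepsilon$ is arbitrary, we get the reverse of inequality (86), whence
it follows that $\omega(\mathcal{E})$ is completely additive:

$$\omega(\mathcal{E}) = \sum_k \omega(\mathcal{E}_k).$$

It follows from what has been said that the $\omega(\mathcal{E})$ defined by (81)
belongs to $V_1$.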
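
Definition (81) can be tried out on a finite set, where all splittings $\mathcal{E} = \mathcal{E}' + \mathcal{E}''$ can be enumerated. The following is a minimal sketch under assumed data: the lists `phi_atoms` and `psi_atoms` are hypothetical values of two additive functions on four atoms. On a finite set the strict lower bound is reached by sending each atom to whichever function is smaller there, which makes the additivity of $\omega$ visible directly.

```python
from itertools import combinations

def inf_pair(phi_atoms, psi_atoms, E):
    """Brute-force inf over all splittings E = E' + E'' of phi(E') + psi(E'')."""
    E = list(E)
    best = None
    for r in range(len(E) + 1):
        for first in combinations(E, r):
            second = [i for i in E if i not in first]
            value = (sum(phi_atoms[i] for i in first)
                     + sum(psi_atoms[i] for i in second))
            if best is None or value < best:
                best = value
    return best

# Hypothetical values of phi and psi on four atoms.
phi_atoms = [3, 1, 4, 1]
psi_atoms = [2, 7, 1, 8]

whole = [0, 1, 2, 3]
omega_whole = inf_pair(phi_atoms, psi_atoms, whole)
omega_parts = (inf_pair(phi_atoms, psi_atoms, [0, 1])
               + inf_pair(phi_atoms, psi_atoms, [2, 3]))
atomwise = sum(min(p, q) for p, q in zip(phi_atoms, psi_atoms))
```

In this example `omega_whole`, `omega_parts` and `atomwise` all agree, and `omega_whole` does not exceed either $\varphi(\mathcal{E})$ or $\psi(\mathcal{E})$, as (84) requires.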
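We introduce the further notation for any function $\varphi(\mathcal{E})$ of $V_1$:

$$\varphi_n(\mathcal{E}) = \inf[\varphi, nG]. \tag{89}$$

On observing that $G(\mathcal{E}) \ge 0$, we can say that $\varphi_{n+1}(\mathcal{E}) \ge \varphi_n(\mathcal{E})$,
and in addition, by (84), $\varphi_n(\mathcal{E}) \le \varphi(\mathcal{E})$. Thus, given any $\mathcal{E}$ belonging
to $\mathcal{E}_0$, the sequence $\varphi_n(\mathcal{E})$ has a finite limit as $n \to \infty$. Let us also
recall the definition of absolute continuity:

$\varphi(\mathcal{E})$ is said to be absolutely continuous on $\mathcal{E}_0$ if, given any $\varepsilon > 0$,
there exists an $\eta > 0$ such that $|\varphi(\mathcal{E})| < \varepsilon$ if $\mathcal{E} \subset \mathcal{E}_0$, $\mathcal{E}$ belongs
to $L_G$ and $G(\mathcal{E}) < \eta$.

Lemma 1. If $\varphi(\mathcal{E})$ is absolutely continuous on $\mathcal{E}_0$, then $\varphi_n(\mathcal{E}) \to \varphi(\mathcal{E})$
for any $\mathcal{E}$.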

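Given $\varepsilon > 0$, we have for some subdivision $\mathcal{E} = \mathcal{E}'_n + \mathcal{E}''_n$, by (83):

$$\varphi(\mathcal{E}'_n) + nG(\mathcal{E}''_n) < \varphi_n(\mathcal{E}) + \varepsilon \le \varphi(\mathcal{E}) + \varepsilon. \tag{90}$$

But, by Theorem 1 of [72], $\varphi(\mathcal{E}'_n) \ge l$, where $l$ is a definite number,
and it follows from (90) that

$$G(\mathcal{E}''_n) < \frac{\varphi(\mathcal{E}) + \varepsilon - l}{n}, \tag{91}$$

where the right-hand side is $< \eta$ for all sufficiently large $n$, whence,
in view of the absolute continuity of $\varphi(\mathcal{E})$, we have $|\varphi(\mathcal{E}''_n)| < \varepsilon$
for sufficiently large $n$. The first of inequalities (90) gives
$\varphi(\mathcal{E}'_n) < \varphi_n(\mathcal{E}) + \varepsilon$. But $\varphi(\mathcal{E}'_n) = \varphi(\mathcal{E}) - \varphi(\mathcal{E}''_n)$, so that
$\varphi(\mathcal{E}) < \varphi_n(\mathcal{E}) + \varepsilon + \varphi(\mathcal{E}''_n)$, and, since $|\varphi(\mathcal{E}''_n)| < \varepsilon$, we have $\varphi(\mathcal{E}) - \varphi_n(\mathcal{E}) < 2\varepsilon$,
whence it follows, since $\varepsilon$ is arbitrary, that $\varphi_n(\mathcal{E}) \to \varphi(\mathcal{E})$.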
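
The statement of Lemma 1 can be illustrated on a finite set in which every atom has positive $G$-measure, so that any additive $\varphi$ is absolutely continuous. The sketch below rests on two assumptions: the data in `G_atoms` and `phi_atoms` are hypothetical, and the infimum in (89) is computed atom by atom as $\min(\varphi, nG)$, which is legitimate only because the set is finite.

```python
def phi_n(phi_atoms, G_atoms, E, n):
    """phi_n(E) = inf[phi, nG](E); on a finite set the infimum is reached atom by atom."""
    return sum(min(phi_atoms[i], n * G_atoms[i]) for i in E)

# Hypothetical data: every atom has positive measure, so phi is absolutely continuous.
G_atoms = [0.5, 0.25, 1.0]
phi_atoms = [2.0, 3.0, 0.5]
E = [0, 1, 2]

values = [phi_n(phi_atoms, G_atoms, E, n) for n in range(1, 13)]
phi_E = sum(phi_atoms[i] for i in E)
```

The list `values` does not decrease with $n$ and reaches `phi_E` once $n$ exceeds every ratio $\varphi/G$ on the atoms (here at $n = 12$).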

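Lemma 2. For a non-negative completely additive function $\varphi(\mathcal{E})$,
the limit of $\varphi_n(\mathcal{E})$ is a completely additive function, absolutely continuous
on $\mathcal{E}_0$.

We use the notation $\lim_{n\to\infty} \varphi_n(\mathcal{E}) = \varphi^{(0)}(\mathcal{E})$. On observing that the
$\varphi_n(\mathcal{E})$ are completely additive and non-negative, and using the
lemma of [63], we can say that $\varphi^{(0)}(\mathcal{E})$ is completely additive. It
follows from (84) that $0 \le \varphi_n(\mathcal{E}) \le nG(\mathcal{E})$, so that each of the $\varphi_n(\mathcal{E})$
is absolutely continuous.

Further, since $\varphi^{(0)}(\mathcal{E}_0 - \mathcal{E}) \ge \varphi_n(\mathcal{E}_0 - \mathcal{E})$, where $\mathcal{E} \subset \mathcal{E}_0$, it
follows that $\varphi^{(0)}(\mathcal{E}_0) - \varphi^{(0)}(\mathcal{E}) \ge \varphi_n(\mathcal{E}_0) - \varphi_n(\mathcal{E})$, i.e.

$$0 \le \varphi^{(0)}(\mathcal{E}) - \varphi_n(\mathcal{E}) \le \varphi^{(0)}(\mathcal{E}_0) - \varphi_n(\mathcal{E}_0). \tag{92}$$

Hence it is clear that the $\varphi_n(\mathcal{E})$ tend to $\varphi^{(0)}(\mathcal{E})$ uniformly with
respect to all $\mathcal{E}$ of $\mathcal{E}_0$. Since the $\varphi_n(\mathcal{E})$ are absolutely
continuous, we can say that $\varphi^{(0)}(\mathcal{E})$ is also absolutely continuous on $\mathcal{E}_0$.
For, given an $\varepsilon > 0$, we can fix an $n = n_0$ such that
$\varphi^{(0)}(\mathcal{E}_0) - \varphi_{n_0}(\mathcal{E}_0) < \varepsilon/2$. Now, by (92), we have $\varphi^{(0)}(\mathcal{E}) < \varphi_{n_0}(\mathcal{E}) + \varepsilon/2$.
In view of the absolute continuity of $\varphi_{n_0}(\mathcal{E})$, there exists a positive
$\eta$ such that $\varphi_{n_0}(\mathcal{E}) < \varepsilon/2$ if $\mathcal{E} \subset \mathcal{E}_0$ and $G(\mathcal{E}) < \eta$, and it follows
from the inequality written above that $\varphi^{(0)}(\mathcal{E}) < \varepsilon$ if $\mathcal{E} \subset \mathcal{E}_0$ and
$G(\mathcal{E}) < \eta$, and the lemma is proved.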

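Lemma 3. If $\varphi(\mathcal{E}) \in V_1$ is non-negative and absolutely continuous
on $\mathcal{E}_0$, given any $\varepsilon > 0$, there is an essentially bounded function $\psi(\mathcal{E})$
such that

$$\|\varphi - \psi\|_1 < \varepsilon. \tag{93}$$

It follows from (84) that, as mentioned above, $0 \le \varphi_n(\mathcal{E}) \le nG(\mathcal{E})$,
i.e. each of the $\varphi_n(\mathcal{E})$ is essentially bounded. In addition,
$\varphi(\mathcal{E}) - \varphi_n(\mathcal{E}) \ge 0$, so that the total variation of this difference on $\mathcal{E}_0$
is equal to its value for $\mathcal{E}_0$, i.e. $\|\varphi(\mathcal{E}) - \varphi_n(\mathcal{E})\|_1 = \varphi(\mathcal{E}_0) - \varphi_n(\mathcal{E}_0)$.
But, since $\varphi(\mathcal{E})$ is absolutely continuous, we have by Lemma 1:
$\varphi_n(\mathcal{E}_0) \to \varphi(\mathcal{E}_0)$, i.e. $\|\varphi(\mathcal{E}) - \varphi_n(\mathcal{E})\|_1 \to 0$, and we can satisfy
inequality (93) simply by taking as $\psi(\mathcal{E})$ the function $\varphi_n(\mathcal{E})$ with some
sufficiently large $n$. As a matter of fact, (93) can be satisfied by
choosing as $\psi(\mathcal{E})$ a function piecewise constant on $\mathcal{E}_0$. We shall
prove this as a preliminary for functions $\varphi(\mathcal{E})$ of $V_2$.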

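THEOREM 1. If $\varphi(\mathcal{E}) \in V_2$, given any $\varepsilon > 0$, there exists a piecewise
constant function $\omega(\mathcal{E})$ such that

$$\|\varphi - \omega\|_2 < \varepsilon. \tag{94}$$

Let $f(P)$ be a measurable function of $L_2$ on $\mathcal{E}$, and

$$a = \frac{1}{G(\mathcal{E})} \int_{\mathcal{E}} f(P)\, G(d\mathcal{E}),$$

this expression being assumed equal to zero if $G(\mathcal{E}) = 0$. We have
the obvious equation:

$$\int_{\mathcal{E}} [f(P) - a]^2\, G(d\mathcal{E}) = \int_{\mathcal{E}} f^2(P)\, G(d\mathcal{E}) - a^2 G(\mathcal{E}).$$

On putting $\mathcal{E} = \mathcal{E}_k$ and $a = \int_{\mathcal{E}_k} f(P)\, G(d\mathcal{E}) / G(\mathcal{E}_k)$, where $\mathcal{E}_k$ are
the subsets of some subdivision $\delta$ of the set $\mathcal{E}_0$, and summing over $k$,
we get

$$\|f(P) - f_\delta(P)\|_{L_2}^2 = \|f(P)\|_{L_2}^2 - \|f_\delta(P)\|_{L_2}^2. \tag{95}$$

If we take $f(P) = g_{\delta'}(P)$, where $\delta' > \delta$ and $g_\delta(P)$ is the function
appearing in (77), by (80), we have $f_\delta(P) = g_\delta(P)$ and (95) gives:

$$\|g_{\delta'}(P) - g_\delta(P)\|_{L_2}^2 = \|g_{\delta'}(P)\|_{L_2}^2 - \|g_\delta(P)\|_{L_2}^2. \tag{96}$$

On taking (78) and (79) into account, we can write

$$\|(\varphi - \varphi_\delta)_{\delta'}\|_2^2 = \|\varphi_{\delta'}\|_2^2 - \|\varphi_\delta\|_2^2. \tag{97}$$

By (76), there exist sequences of subdivisions $\delta'_n$ and $\delta''_n$ such that
$\|(\varphi - \varphi_\delta)_{\delta'_n}\|_2 \to \|\varphi - \varphi_\delta\|_2$ and $\|\varphi_{\delta''_n}\|_2 \to \|\varphi\|_2$. For the sequence
$\delta_n = \delta\,\delta'_n\delta''_n$, we have all the more: $\|(\varphi - \varphi_\delta)_{\delta_n}\|_2 \to \|\varphi - \varphi_\delta\|_2$ and
$\|\varphi_{\delta_n}\|_2 \to \|\varphi\|_2$. On putting $\delta' = \delta_n$ in (97), and passing to the
limit, we get:

$$\|\varphi - \varphi_\delta\|_2^2 = \|\varphi\|_2^2 - \|\varphi_\delta\|_2^2. \tag{98}$$

On again taking (76) into account, we can say that there exists
a subdivision $\delta$ such that the right-hand side of (98) is $< \varepsilon^2$, and, putting
$\omega(\mathcal{E}) = \varphi_\delta(\mathcal{E})$, we get (94).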
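
Identity (95), on which the proof rests, says that replacing $f$ by its block averages $f_\delta$ splits the squared norm orthogonally. The sketch below checks it exactly on a discrete model; the weights `G`, the values `f` and the subdivision `delta` are assumed purely for illustration.

```python
from fractions import Fraction

def block_average(f, G, partition):
    """f_delta: replace f on each block by its G-weighted mean (zero if the block has measure 0)."""
    out = [Fraction(0)] * len(f)
    for block in partition:
        mass = sum(G[i] for i in block)
        mean = sum(f[i] * G[i] for i in block) / mass if mass != 0 else Fraction(0)
        for i in block:
            out[i] = mean
    return out

def sq_norm(f, G):
    """Squared L2 norm of f with respect to the weights G."""
    return sum(x * x * w for x, w in zip(f, G))

# Hypothetical data on four atoms.
G = [Fraction(1), Fraction(2), Fraction(1), Fraction(4)]
f = [Fraction(3), Fraction(-1, 2), Fraction(2), Fraction(1, 2)]
delta = [[0, 1], [2, 3]]

f_delta = block_average(f, G, delta)
lhs = sq_norm([a - b for a, b in zip(f, f_delta)], G)   # ||f - f_delta||^2
rhs = sq_norm(f, G) - sq_norm(f_delta, G)               # ||f||^2 - ||f_delta||^2
```

With rational arithmetic `lhs` and `rhs` coincide exactly, which is the discrete form of (95).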

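THEOREM 2. If $\varphi(\mathcal{E}) \in V_1$ and is absolutely continuous in $\mathcal{E}_0$, given
any $\varepsilon > 0$ there exists a function $\omega(\mathcal{E})$, piecewise constant on $\mathcal{E}_0$,
such that

$$\|\varphi - \omega\|_1 < \varepsilon. \tag{99}$$

We can write $\varphi$ as the difference between two non-negative functions
of $V_1$: $\varphi(\mathcal{E}) = \varphi_1(\mathcal{E}) - \varphi_2(\mathcal{E})$, and if the piecewise constant functions
$\omega_1(\mathcal{E})$ and $\omega_2(\mathcal{E})$ exist, such that $\|\varphi_1 - \omega_1\|_1 < \varepsilon/2$ and
$\|\varphi_2 - \omega_2\|_1 < \varepsilon/2$, we obtain by (59), on introducing the piecewise
constant function $\omega(\mathcal{E}) = \omega_1(\mathcal{E}) - \omega_2(\mathcal{E})$:

$$\|\varphi - \omega\|_1 \le \|\varphi_1 - \omega_1\|_1 + \|\varphi_2 - \omega_2\|_1 < \varepsilon,$$

and it is therefore sufficient to prove the theorem for the case of
non-negative functions $\varphi(\mathcal{E})$.

By Lemma 3, there exists an essentially bounded function $\psi(\mathcal{E})$
such that $\|\varphi - \psi\|_1 < \varepsilon/2$. The function $\psi(\mathcal{E}) \in V_2$ [78], and hence
there exists a piecewise constant function $\omega(\mathcal{E})$ such that
$\|\psi - \omega\|_2^2 < \varepsilon^2/[4G(\mathcal{E}_0)]$. By (63), we have $\|\psi - \omega\|_1 < \varepsilon/2$, and we can write,
on using (59):

$$\|\varphi - \omega\|_1 \le \|\varphi - \psi\|_1 + \|\psi - \omega\|_1 < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon,$$

and the theorem is proved. The assertion of Theorem 1 amounts to
the fact that the piecewise constant functions are everywhere dense
in $V_2$ with norm $\|\varphi\|_2$, whilst that of Theorem 2 amounts to saying
that the piecewise constant functions are everywhere dense in the
space of absolutely continuous functions of $V_1$ with norm $\|\varphi\|_1$.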


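80. Fundamental theorem. We turn to the proof of the fundamental
theorem of [73]. As above, it is sufficient to prove the theorem for
non-negative functions $\varphi(\mathcal{E})$ of $V_1$. As usual, we write $\varphi^{(0)}(\mathcal{E})$ for
the limit of the $\varphi_n(\mathcal{E})$ defined by (89). By Lemmas 1 and 2, the equation
$\varphi^{(0)}(\mathcal{E}) = \varphi(\mathcal{E})$ holds when and only when $\varphi(\mathcal{E})$ is absolutely
continuous in $\mathcal{E}_0$. Let $\varphi(\mathcal{E})$ be not absolutely continuous. We form the
function

$$\varphi^{(s)}(\mathcal{E}) = \varphi(\mathcal{E}) - \varphi^{(0)}(\mathcal{E}), \tag{100}$$

non-negative and completely additive in $\mathcal{E}_0$.
We show that $\varphi^{(s)}(\mathcal{E})$ is the singular term in (14) of [73]. We form

$$\varphi^{(s)}_n(\mathcal{E}) = \inf[\varphi^{(s)}, nG] = \inf_{\mathcal{E} = \mathcal{E}' + \mathcal{E}''} [\varphi(\mathcal{E}') - \varphi^{(0)}(\mathcal{E}') + nG(\mathcal{E}'')]. \tag{101}$$

We recall the subdivision $\mathcal{E} = \mathcal{E}'_n + \mathcal{E}''_n$ satisfying condition (90),
and inequality (91), by virtue of which $G(\mathcal{E}''_n) \to 0$ as $n \to \infty$. On
taking definition (101) into account, we can write

$$\varphi^{(s)}_n(\mathcal{E}) \le \varphi(\mathcal{E}'_n) + nG(\mathcal{E}''_n) - \varphi^{(0)}(\mathcal{E}'_n) = \varphi(\mathcal{E}'_n) + nG(\mathcal{E}''_n) - [\varphi^{(0)}(\mathcal{E}) - \varphi^{(0)}(\mathcal{E}''_n)],$$

i.e., by (90),

$$\varphi^{(s)}_n(\mathcal{E}) < \varphi_n(\mathcal{E}) + \varepsilon - [\varphi^{(0)}(\mathcal{E}) - \varphi^{(0)}(\mathcal{E}''_n)].$$

But $\varphi^{(0)}(\mathcal{E})$ is absolutely continuous, so that we have $0 \le \varphi^{(0)}(\mathcal{E}''_n) < \varepsilon$
for all sufficiently large $n$, i.e.

$$\varphi^{(s)}_n(\mathcal{E}) < \varphi_n(\mathcal{E}) - \varphi^{(0)}(\mathcal{E}) + 2\varepsilon.$$

Recalling that $\varphi_n(\mathcal{E}) \to \varphi^{(0)}(\mathcal{E})$, we get $\lim_{n\to\infty} \varphi^{(s)}_n(\mathcal{E}) \le 2\varepsilon$, i.e. this
limit is zero for any $\mathcal{E}$, since $\varepsilon$ is arbitrary. But the $\varphi^{(s)}_n(\mathcal{E})$ are
non-negative and do not decrease as $n$ increases, so that we have for
any $n$:

$$\varphi^{(s)}_n(\mathcal{E}) = \inf_{\mathcal{E} = \mathcal{E}' + \mathcal{E}''} [\varphi^{(s)}(\mathcal{E}') + nG(\mathcal{E}'')] = 0.$$

On applying this to $\mathcal{E}_0$ with $n = 1$, we can say that, given any
$\varepsilon > 0$, there exist sets $\mathcal{E}_n$ belonging to $\mathcal{E}_0$ and to $L_G$ such that

$$\varphi^{(s)}(\mathcal{E}_n) + G(\mathcal{E}_0 - \mathcal{E}_n) < \frac{\varepsilon}{2^n},$$

whence, since $\varphi^{(s)}(\mathcal{E})$ and $G(\mathcal{E})$ are non-negative,

$$\varphi^{(s)}(\mathcal{E}_n) < \frac{\varepsilon}{2^n} \quad \text{and} \quad G(\mathcal{E}_0 - \mathcal{E}_n) < \frac{\varepsilon}{2^n}. \tag{102}$$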


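We form the set $\mathcal{E}' = \mathcal{E}_1 + \mathcal{E}_2 + \ldots$, belonging to $\mathcal{E}_0$ and $L_G$.
On observing that $\mathcal{E}_n \subset \mathcal{E}'$ for any $n$, we can write $G(\mathcal{E}_0 - \mathcal{E}') < \varepsilon/2^n$
and, on letting $n$ tend to infinity, we get $G(\mathcal{E}_0 - \mathcal{E}') = 0$.
On the other hand, on observing that

$$\mathcal{E}' = \mathcal{E}_1 + (\mathcal{E}_2 - \mathcal{E}_1) + (\mathcal{E}_3 - \mathcal{E}_1 - \mathcal{E}_2) + \ldots,$$

and using the first of inequalities (102) and the fact that $\varphi^{(s)}(\mathcal{E})$ is
non-negative, we get $\varphi^{(s)}(\mathcal{E}') < \varepsilon$. Thus

$$\varphi^{(s)}(\mathcal{E}') < \varepsilon \quad \text{and} \quad G(\mathcal{E}_0 - \mathcal{E}') = 0.$$

On writing $\mathcal{E}_0 - \mathcal{E}' = \mathcal{E}_\varepsilon$, we can say that, given any $\varepsilon > 0$,
there exists a set $\mathcal{E}_\varepsilon$ of $\mathcal{E}_0$ such that

$$\varphi^{(s)}(\mathcal{E}_0 - \mathcal{E}_\varepsilon) < \varepsilon \quad \text{and} \quad G(\mathcal{E}_\varepsilon) = 0.$$

Let $\varepsilon_n$ be positive and $\to 0$. We can write

$$\varphi^{(s)}(\mathcal{E}_0 - \mathcal{E}_{\varepsilon_n}) < \varepsilon_n \quad \text{and} \quad G(\mathcal{E}_{\varepsilon_n}) = 0.$$

We introduce the set $H = \mathcal{E}_{\varepsilon_1} + \mathcal{E}_{\varepsilon_2} + \ldots$ In view of the last
equation, $G(H) = 0$, and, on observing that $\mathcal{E}_{\varepsilon_n} \subset H$ for any $n$,
we have by the first inequality: $\varphi^{(s)}(\mathcal{E}_0 - H) < \varepsilon_n$, and, on letting $n$
tend to infinity, we get $\varphi^{(s)}(\mathcal{E}_0 - H) = 0$. Thus there exists an $H$
such that

$$\varphi^{(s)}(\mathcal{E}_0 - H) = 0 \quad \text{and} \quad G(H) = 0. \tag{103}$$

Any $\mathcal{E} = \mathcal{E}H + (\mathcal{E} - \mathcal{E}H)$, but $\mathcal{E} - \mathcal{E}H \subset \mathcal{E}_0 - H$, and, since
$\varphi^{(s)}(\mathcal{E})$ is non-negative, the first of formulae (103) gives $\varphi^{(s)}(\mathcal{E} - \mathcal{E}H) = 0$,
so that $\varphi^{(s)}(\mathcal{E}) = \varphi^{(s)}(\mathcal{E}H)$. Thus there exists an $H$ such that

$$\varphi^{(s)}(\mathcal{E}) = \varphi^{(s)}(\mathcal{E}H) \quad \text{and} \quad G(H) = 0.$$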
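
On a finite set the construction above becomes transparent: the atoms of zero $G$-measure form the set $H$, the limit $\varphi^{(0)}$ keeps $\varphi$ on the remaining atoms, and $\varphi^{(s)}$ lives on $H$ alone. The sketch below assumes a non-negative $\varphi$ and hypothetical data in `G_atoms` and `phi_atoms`; the function name `split_parts` is introduced only for this illustration.

```python
def split_parts(phi_atoms, G_atoms):
    """Atom values of phi^(0) and phi^(s) = phi - phi^(0), together with the null set H."""
    H = [i for i, g in enumerate(G_atoms) if g == 0]
    phi0 = [0 if g == 0 else p for p, g in zip(phi_atoms, G_atoms)]
    phis = [p - q for p, q in zip(phi_atoms, phi0)]
    return phi0, phis, H

# Hypothetical data: atoms 1 and 3 carry no G-measure but carry part of phi.
G_atoms = [1, 0, 2, 0]
phi_atoms = [3, 5, 1, 2]

phi0, phis, H = split_parts(phi_atoms, G_atoms)

# phi_n = inf[phi, nG], computed atom by atom, already equals phi^(0) for n = 10.
phi_10 = [min(p, 10 * g) for p, g in zip(phi_atoms, G_atoms)]
```

Here `phis` vanishes off `H`, the set `H` has zero $G$-measure, and `phi_10` coincides with `phi0`.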


Hence $\varphi^{(s)}(\mathcal{E})$ is the singular term in (14) of [73], and on taking
into account (100) and the absolute continuity of $\varphi^{(0)}(\mathcal{E})$, we need
to prove the following theorem in order to prove the theorem of [73]:

THEOREM 3. Any absolutely continuous function $\varphi(\mathcal{E})$ of $V_1$ can be
expressed by the integral

$$\varphi(\mathcal{E}) = \int_{\mathcal{E}} f(P)\, G(d\mathcal{E}), \tag{104}$$

where $f(P)$ is measurable and integrable on $\mathcal{E}_0$.
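By Theorem 2, there exist piecewise constant functions $\omega_n(\mathcal{E})$
such that

$$\|\varphi - \omega_n\|_1 < \frac{1}{2^{n+1}}, \tag{105}$$

whence it follows, by (59), that

$$\|\omega_{n+1} - \omega_n\|_1 \le \|\varphi - \omega_{n+1}\|_1 + \|\varphi - \omega_n\|_1 < \frac{1}{2^n}.$$

But every $\omega_n(\mathcal{E})$ is the integral of a piecewise constant function
$g_n(P)$ with a finite number of finite values on $\mathcal{E}_0$ [78]:

$$\omega_n(\mathcal{E}) = \int_{\mathcal{E}} g_n(P)\, G(d\mathcal{E}),$$

and

$$\omega_{n+1}(\mathcal{E}) - \omega_n(\mathcal{E}) = \int_{\mathcal{E}} [g_{n+1}(P) - g_n(P)]\, G(d\mathcal{E}).$$

The total variation of this set function, expressible by an integral,
is given by [73]:

$$\|\omega_{n+1} - \omega_n\|_1 = \int_{\mathcal{E}_0} |g_{n+1}(P) - g_n(P)|\, G(d\mathcal{E}),$$

whence

$$\int_{\mathcal{E}_0} |g_{n+1}(P) - g_n(P)|\, G(d\mathcal{E}) < \frac{1}{2^n}, \tag{106}$$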


and the series consisting of these integrals is convergent. The series

$$|g_1(P)| + |g_2(P) - g_1(P)| + |g_3(P) - g_2(P)| + \ldots \tag{107}$$

is now convergent almost everywhere in $\mathcal{E}_0$ [54], and all the more,
the series

$$f(P) = g_1(P) + [g_2(P) - g_1(P)] + [g_3(P) - g_2(P)] + \ldots$$

is convergent almost everywhere, i.e. $g_n(P) \to f(P)$ almost everywhere
in $\mathcal{E}_0$. The sum of series (107) is an integrable function on $\mathcal{E}_0$ by virtue
of (106). But $|f(P)| \le$ this sum, so that $f(P)$ is also integrable.
We can write

$$f(P) = g_n(P) + [g_{n+1}(P) - g_n(P)] + [g_{n+2}(P) - g_{n+1}(P)] + \ldots$$

and, on taking (106) into account, we obtain for any $\mathcal{E}$ of $\mathcal{E}_0$:

$$\left| \int_{\mathcal{E}} [f(P) - g_n(P)]\, G(d\mathcal{E}) \right| < \frac{1}{2^n} + \frac{1}{2^{n+1}} + \ldots = \frac{1}{2^{n-1}},$$

whence it follows at once that

$$\lim_{n\to\infty} \omega_n(\mathcal{E}) = \lim_{n\to\infty} \int_{\mathcal{E}} g_n(P)\, G(d\mathcal{E}) = \int_{\mathcal{E}} f(P)\, G(d\mathcal{E})$$

for any $\mathcal{E}$. On the other hand, it follows from the definition of the
total variation over $\mathcal{E}_0$ and (105) that

$$|\varphi(\mathcal{E}) - \omega_n(\mathcal{E})| \le \|\varphi - \omega_n\|_1 < \frac{1}{2^{n+1}},$$

whence $\omega_n(\mathcal{E}) \to \varphi(\mathcal{E})$ for any $\mathcal{E}$, i.e.

$$\varphi(\mathcal{E}) = \int_{\mathcal{E}} f(P)\, G(d\mathcal{E}),$$

and the theorem is proved. We have so far assumed that $G(\mathcal{E}_0)$ is a
finite number. If $G(\mathcal{E}_0) = +\infty$, the result is obtained by a passage
to the limit from the sets $\mathcal{E}_n$ with a finite value of $G(\mathcal{E}_n)$, for which
the theorem is proved, $f(P)$ being independent of $n$.


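81. Hellinger's integrals. Let us investigate in more detail the
family $V_2$. We note first of all that, in view of condition (60), if
$\varphi(\mathcal{E}) \in V_2$, it is absolutely continuous. We have (98)

$$\|\varphi - \varphi_\delta\|_2^2 = \|\varphi\|_2^2 - \|\varphi_\delta\|_2^2 \tag{108}$$

and, in addition [78]:

$$\varphi_\delta(\mathcal{E}) = \int_{\mathcal{E}} g_\delta(P)\, G(d\mathcal{E}) \quad \text{and} \quad \|\varphi_\delta\|_2^2 = \int_{\mathcal{E}_0} g_\delta^2(P)\, G(d\mathcal{E}). \tag{109}$$

By (76), we can choose a sequence of subdivisions $\delta_n$ of $\mathcal{E}_0$ so as to
have

$$\|\varphi - \varphi_{\delta_n}\|_2^2 = \|\varphi\|_2^2 - \|\varphi_{\delta_n}\|_2^2 < \frac{1}{4^{n+1} G(\mathcal{E}_0)}. \tag{110}$$

We now have, by (62): $\|\varphi - \varphi_{\delta_n}\|_1 < 1/2^{n+1}$, and it follows from
the proof of the theorem of the previous section that $g_{\delta_n}(P) \to f(P)$
almost everywhere on $\mathcal{E}_0$ and

$$\varphi(\mathcal{E}) = \int_{\mathcal{E}} f(P)\, G(d\mathcal{E}). \tag{111}$$

It follows from (109) and (110) that now:

$$\int_{\mathcal{E}_0} g_{\delta_n}^2(P)\, G(d\mathcal{E}) \le \|\varphi\|_2^2 \quad \text{and} \quad \int_{\mathcal{E}_0} g_{\delta_n}^2(P)\, G(d\mathcal{E}) \to \|\varphi\|_2^2.$$

On using Theorem 4 of [54], we can write

$$\int_{\mathcal{E}_0} f^2(P)\, G(d\mathcal{E}) \le \|\varphi\|_2^2, \tag{112}$$

whence it follows that $f(P) \in L_2$ on $\mathcal{E}_0$. We shall now prove the
reverse inequality to (112).
On applying Buniakowski's inequality to (111), we can write

$$\varphi^2(\mathcal{E}_k) \le \int_{\mathcal{E}_k} f^2(P)\, G(d\mathcal{E}) \int_{\mathcal{E}_k} G(d\mathcal{E}) = G(\mathcal{E}_k) \int_{\mathcal{E}_k} f^2(P)\, G(d\mathcal{E}).$$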


On dividing by $G(\mathcal{E}_k)$ and summing over $k$, we get

$$S_\delta(\varphi) = \sum_k \frac{\varphi^2(\mathcal{E}_k)}{G(\mathcal{E}_k)} \le \int_{\mathcal{E}_0} f^2(P)\, G(d\mathcal{E}), \tag{113}$$

so that we have for the strict upper bound of the sums written:

$$\|\varphi\|_2^2 \le \int_{\mathcal{E}_0} f^2(P)\, G(d\mathcal{E}).$$

Comparison with (112) gives us

$$\|\varphi\|_2^2 = \int_{\mathcal{E}_0} f^2(P)\, G(d\mathcal{E}). \tag{114}$$

We have shown that, for every $\varphi(\mathcal{E})$ of $V_2$, the function $f(P)$ appearing
in (111) belongs to $L_2$ and that (114) holds. Conversely, if we
know that $\varphi(\mathcal{E})$ can be expressed by (111), where $f(P) \in L_2$ on $\mathcal{E}_0$,
then it follows from (113), deduced solely on the basis of (111), that
$\varphi(\mathcal{E}) \in V_2$. Since the representation by (111) is unique, we can say
that (114) in fact holds. We thus arrive at the following important
theorem:

THEOREM 1. The necessary and sufficient condition for $\varphi(\mathcal{E})$ to belong
to $V_2$ on $\mathcal{E}_0$ is that $\varphi(\mathcal{E})$ be expressible by (111), where $f(P) \in L_2$ on $\mathcal{E}_0$.
If this condition is satisfied, (114) holds.

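Let us indicate another necessary and sufficient condition for
$\varphi(\mathcal{E})$ to belong to $V_2$.

THEOREM 2. The necessary and sufficient condition for $\varphi(\mathcal{E})$ to belong
to $V_2$ on $\mathcal{E}_0$ is that there exists a non-negative function $H(\mathcal{E})$, completely
additive on $\mathcal{E}_0$, such that

$$\varphi^2(\mathcal{E}) \le G(\mathcal{E})\, H(\mathcal{E}). \tag{115}$$

For, if this condition is fulfilled, the sums $S_\delta(\varphi)$ are bounded:

$$S_\delta(\varphi) = \sum_k \frac{\varphi^2(\mathcal{E}_k)}{G(\mathcal{E}_k)} \le \sum_k H(\mathcal{E}_k) = H(\mathcal{E}_0).$$

Conversely, if $\varphi(\mathcal{E}) \in V_2$, (111) holds and $f(P) \in L_2$ on $\mathcal{E}_0$ and
hence on any subset of $\mathcal{E}_0$ measurable with respect to $G(\mathcal{E})$. We put

$$H(\mathcal{E}) = \int_{\mathcal{E}} f^2(P)\, G(d\mathcal{E}). \tag{116}$$

On applying Buniakowski's inequality to (111), we get (115), and
the theorem is thus proved.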
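
Inequality (115) with the choice (116) of $H(\mathcal{E})$ can be verified for every subset of a small discrete model. This is a sketch only: the weights `G` and the density `f` are assumed values, and on a finite set the integrals in (111) and (116) reduce to finite sums over atoms.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical weights G and density f on three atoms.
G = [Fraction(1), Fraction(2), Fraction(1, 2)]
f = [Fraction(3), Fraction(-1), Fraction(4)]

def measure(weights, E):
    """Value on the set E of the additive function with the given atom weights."""
    return sum(weights[i] for i in E)

phi_atoms = [fi * gi for fi, gi in zip(f, G)]        # phi(E) = sum of f G over E, cf. (111)
H_atoms = [fi * fi * gi for fi, gi in zip(f, G)]     # H(E) = sum of f^2 G over E, cf. (116)

subsets = [c for r in range(1, 4) for c in combinations(range(3), r)]
holds = all(
    measure(phi_atoms, E) ** 2 <= measure(G, E) * measure(H_atoms, E)
    for E in subsets
)
```

The flag `holds` is true for all seven non-empty subsets, with equality on single atoms, where $f$ is constant.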


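If $\varphi(\mathcal{E}) \in V_2$ on $\mathcal{E}_0$, the strict upper bound of the sums $S_\delta(\varphi)$ is
called a Hellinger integral and is denoted by the following symbol:

$$\|\varphi\|_2^2 = \sup S_\delta(\varphi) = \int_{\mathcal{E}_0} \frac{\varphi^2(d\mathcal{E})}{G(d\mathcal{E})}. \tag{117}$$

Formula (114) now gives the transformation of a Hellinger to a
Lebesgue integral:

$$\int_{\mathcal{E}_0} \frac{\varphi^2(d\mathcal{E})}{G(d\mathcal{E})} = \int_{\mathcal{E}_0} f^2(P)\, G(d\mathcal{E}). \tag{118}$$

If the subdivision $\delta'$ is a continuation of $\delta$, and $i$ is integral (117),
i.e. the strict upper bound of the sums $S_\delta(\varphi)$, then, as we know,
$S_\delta(\varphi) \le S_{\delta'}(\varphi) \le i$. On taking this into account, we can say that integral
(117) has the following property: given any $\varepsilon > 0$, there exists a
subdivision $\delta_\varepsilon$ such that, for any continuation of it,

$$|i - S_{\delta'}(\varphi)| < \varepsilon \quad (\delta' \text{ is a continuation of } \delta_\varepsilon). \tag{119}$$

Let us show that there can only be one $i$ with this property.
Suppose $i'$ is a further number with the property. In addition to
(119), we shall have $|i' - S_{\delta'}(\varphi)| < \varepsilon$, where $\delta'_\varepsilon$ plays the role of
$\delta_\varepsilon$ in (119). On taking the subdivision $\delta''_\varepsilon = \delta_\varepsilon\delta'_\varepsilon$, we can write the
two inequalities:

$$|i - S_\delta(\varphi)| < \varepsilon \quad \text{and} \quad |i' - S_\delta(\varphi)| < \varepsilon \quad (\delta \text{ is a continuation of } \delta''_\varepsilon).$$

Since $i' - i = (i' - S_\delta(\varphi)) + (S_\delta(\varphi) - i)$, we obtain for these $\delta$:

$$|i' - i| \le |i' - S_\delta(\varphi)| + |i - S_\delta(\varphi)| < 2\varepsilon,$$

whence, since $\varepsilon$ is arbitrary, it follows that $i' = i$.

Let $\varphi(\mathcal{E})$ and $\varphi_1(\mathcal{E}) \in V_2$; we consider the sum

$$S_\delta(\varphi, \varphi_1) = \sum_k \frac{\varphi(\mathcal{E}_k)\,\varphi_1(\mathcal{E}_k)}{G(\mathcal{E}_k)}, \tag{120}$$

which can obviously be written as

$$\frac{1}{2} \sum_k \frac{[\varphi(\mathcal{E}_k) + \varphi_1(\mathcal{E}_k)]^2}{G(\mathcal{E}_k)} - \frac{1}{2} \sum_k \frac{\varphi^2(\mathcal{E}_k)}{G(\mathcal{E}_k)} - \frac{1}{2} \sum_k \frac{\varphi_1^2(\mathcal{E}_k)}{G(\mathcal{E}_k)}, \tag{121}$$

i.e.

$$S_\delta(\varphi, \varphi_1) = \frac{1}{2} S_\delta(\varphi + \varphi_1) - \frac{1}{2} S_\delta(\varphi) - \frac{1}{2} S_\delta(\varphi_1).$$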
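
The passage from (120) to (121) is the usual polarization of a quadratic form, and it can be confirmed on a discrete example. In the sketch below the atom values `phi`, `phi1`, the positive weights `G` and the subdivision are all hypothetical.

```python
from fractions import Fraction

def S(phi, G, partition):
    """Quadratic sum: sum_k phi(E_k)^2 / G(E_k), all G(E_k) assumed positive."""
    return sum(Fraction(sum(phi[i] for i in b)) ** 2 / sum(G[i] for i in b)
               for b in partition)

def S_mixed(phi, phi1, G, partition):
    """Mixed sum (120): sum_k phi(E_k) phi1(E_k) / G(E_k)."""
    return sum(Fraction(sum(phi[i] for i in b)) * sum(phi1[i] for i in b)
               / sum(G[i] for i in b)
               for b in partition)

# Hypothetical data on four atoms.
G = [1, 2, 3, 4]
phi = [1, -2, 0, 5]
phi1 = [2, 2, -1, 3]
partition = [[0, 1], [2], [3]]

phi_sum = [a + b for a, b in zip(phi, phi1)]
lhs = S_mixed(phi, phi1, G, partition)
rhs = (S(phi_sum, G, partition) - S(phi, G, partition) - S(phi1, G, partition)) / 2
```

Both `lhs` and `rhs` evaluate to the same rational number, as (121) asserts.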




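We have property (119) for each of the sums on the right-hand
side, where the $\delta_\varepsilon$ can be taken as the same, since different $\delta_\varepsilon$ can
be replaced by their product. We therefore have property (119) for
the sums $S_\delta(\varphi, \varphi_1)$; the corresponding $i$ for sums (120) is written as

$$\int_{\mathcal{E}_0} \frac{\varphi(d\mathcal{E})\,\varphi_1(d\mathcal{E})}{G(d\mathcal{E})}. \tag{122}$$

On taking (121) into account, we can write

$$\int_{\mathcal{E}_0} \frac{\varphi(d\mathcal{E})\,\varphi_1(d\mathcal{E})}{G(d\mathcal{E})} = \frac{1}{2} \int_{\mathcal{E}_0} \frac{[\varphi(d\mathcal{E}) + \varphi_1(d\mathcal{E})]^2}{G(d\mathcal{E})} - \frac{1}{2} \int_{\mathcal{E}_0} \frac{\varphi^2(d\mathcal{E})}{G(d\mathcal{E})} - \frac{1}{2} \int_{\mathcal{E}_0} \frac{\varphi_1^2(d\mathcal{E})}{G(d\mathcal{E})},$$

or, on taking (118) into account:

$$\int_{\mathcal{E}_0} \frac{\varphi(d\mathcal{E})\,\varphi_1(d\mathcal{E})}{G(d\mathcal{E})} = \int_{\mathcal{E}_0} f(P)\, f_1(P)\, G(d\mathcal{E}), \tag{123}$$

where $f_1(P)$ is the point function of $L_2$ corresponding to $\varphi_1(\mathcal{E})$:

$$\varphi_1(\mathcal{E}) = \int_{\mathcal{E}} f_1(P)\, G(d\mathcal{E}). \tag{124}$$

We can investigate in the same way more general sums of the form

$$\sigma_\delta = \sum_k u(P_k)\, \frac{\varphi(\mathcal{E}_k)\,\varphi_1(\mathcal{E}_k)}{G(\mathcal{E}_k)}, \tag{125}$$

where $u(P)$ is bounded and measurable with respect to $G(\mathcal{E})$, and
$P_k$ is any point of $\mathcal{E}_k$. It can be shown that there exists for these
sums a unique number $i$ with property (119), in which $S_\delta$ has to
be replaced by $\sigma_\delta$, the inequality in (119) being fulfilled for any
choice of $P_k$. This number $i$ can be expressed by a Lebesgue-Stieltjes
integral:

$$i = \int_{\mathcal{E}_0} u(P)\, f(P)\, f_1(P)\, G(d\mathcal{E}). \tag{126}$$

Property (119) is at the basis of the general definition of the integral.
The next sections give a more detailed treatment of the case of a
single variable.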


82. The case of a single variable. We shall investigate this case by
starting from a point function and considering the simplest
case of continuous functions. The following notation will be used
for brevity: if $\Delta$ is an interval $[\alpha, \beta]$, the symbol $\Delta\tau(x)$ will denote
the difference $\tau(\beta) - \tau(\alpha)$. Let $g(x)$ be a non-decreasing function
continuous on the finite interval $[a, b]$, and $F(x)$ a real function,
continuous on this interval, with the property that $\Delta F(x) = 0$ if
$\Delta g(x) = 0$. Let $\delta$ be a subdivision of $[a, b]$ into a finite number of
sub-intervals $\Delta_k$ and

$$S_\delta = \sum_k \frac{(\Delta_k F)^2}{\Delta_k g}. \tag{127}$$

The terms of the form 0/0 are assumed zero. This sum does not
decrease on adding new points of subdivision [78]. Since we shall
be making use of a division into sub-intervals, we need to prove the
theorems analogous to those that we had for a division into sets
measurable with respect to $G(\mathcal{E})$.

THEOREM 1. The necessary and sufficient condition for the set of
values of the sums $S_\delta$ to be bounded is that there exists a non-decreasing
function $h(x)$, bounded on $[a, b]$, such that

$$(\Delta F)^2 \le \Delta g \cdot \Delta h \tag{128}$$

for any sub-interval of $[a, b]$.

If this condition is satisfied, the terms of $S_\delta$ do not exceed $\Delta_k h$
and we have $S_\delta \le h(b) - h(a)$ for any subdivision.

We now prove the necessity of (128). If the $S_\delta$ are bounded for
the whole of $[a, b]$, they are all the more bounded for any part of
$[a, b]$. Let $h(x)$ denote the strict upper bound of $S_\delta$ for the interval
$[a, x]$:

$$h(x) = \sup S_\delta \quad \text{for the interval } [a, x].$$

It can be shown, along the same lines as when proving that the
total variation is completely additive [8], that the strict upper bound
of $S_\delta$ for any sub-interval $[\alpha, \beta]$ is equal to $h(\beta) - h(\alpha) = \Delta h$, so
that $h(x)$ is a non-decreasing and bounded function. If we do not
divide $\Delta$ into sub-intervals, the sum $S_\delta$ for $\Delta$ reduces to the single
term $(\Delta F)^2/\Delta g$, and it is not greater than the strict upper bound $\Delta h$ of
$S_\delta$ for $\Delta$, i.e. $(\Delta F)^2/\Delta g \le \Delta h$, which amounts to (128).

Let $G(\mathcal{E})$ be the set function on $[a, b]$ generated by the point function
$g(x)$, and let $F(x)$ have the form

$$F(x) = \int_{[a, x]} f(x)\, G(d\mathcal{E}) + C, \tag{129}$$

where $f(x) \in L_2$. We have

$$\Delta F = \int_\Delta f(x)\, G(d\mathcal{E}) = \int_\Delta f(x)\, dg(x),$$

and we obtain by applying Buniakowski's inequality:

$$(\Delta F)^2 \le \int_\Delta f^2(x)\, G(d\mathcal{E}) \int_\Delta G(d\mathcal{E}) = \Delta g(x) \int_\Delta f^2(x)\, G(d\mathcal{E}),$$

i.e. condition (128) is fulfilled, and

$$h(x) = \int_{[a, x]} f^2(x)\, G(d\mathcal{E}), \tag{130}$$

where, since $g(x)$ is continuous, it is unimportant whether $[a, x]$ is
closed or not.

Formula (129) leads to the completely additive set function

$$\varphi(\mathcal{E}) = \int_{\mathcal{E}} f(x)\, G(d\mathcal{E}),$$

defined for sets that are measurable with respect to $G(\mathcal{E})$ and belong
to $[a, b]$, where $\Delta F = \varphi(\Delta)$. The sums

$$\sum_k \frac{\varphi^2(\mathcal{E}_k)}{G(\mathcal{E}_k)} \tag{131}$$

have a strict upper bound $i$, given by (118):

$$i = \int_a^b f^2(x)\, G(d\mathcal{E}). \tag{132}$$

Let us show that sums (127), corresponding to a division of $[a, b]$
into sub-intervals only, have the same upper bound. This latter can
never be greater than $i$. We can use the absolute continuity of $\varphi(\mathcal{E})$
to show that integral (132) is in fact the strict upper bound of sums
(127), which are obtained on splitting $[a, b]$ into sub-intervals.
By what has been said, given any $\varepsilon > 0$, there exists a subdivision $\delta$
of $[a, b]$ into measurable sets $\mathcal{E}_k$ $(k = 1, 2, \ldots, n)$, such that

$$\sum_{k=1}^{n} \frac{\varphi^2(\mathcal{E}_k)}{G(\mathcal{E}_k)} > i - \varepsilon, \tag{133}$$

where the $i$ denotes integral (132). In view of the necessity of the
condition in Theorem 1 of [37], we can say that, given any $\mathcal{E}_k$, there
exists an elementary figure $R_k$, i.e. a finite sum of non-overlapping
semi-open intervals, such that

$$R_k + e'_k = \mathcal{E}_k + e''_k \quad (k = 1, 2, \ldots, n), \tag{134}$$




where the measures of e, and e; are as small as desired. The sets &; 
have no points in common with each other, but R; may have common 
points on account of the ef, ie. R,R, C ekej, and the measures of 
these common parts are also as small as desired. In each of equations 
(184) we can refer to ep the part of 2, which is in common with the 
remaining R, this common part being the sum of a finite number 
of semi-open intervals. If 7 is the greatest of the measures of e; and 
ey, we shall have for the new ep with this transfer: G(e,) < (n + 1), 
since G(R,R,) < Gleje) < n. We can therefore assume that the Ry 
in equations (134) have no points in common with each other, and 
the measures of eg and eg are as small as desired. 

If we take into account (133) and the absolute continuity of $\varphi(\mathcal{E})$,
and take the measures of $e'_k$ and $e''_k$ as sufficiently small, we can write

$$\sum_{k=1}^{n} \frac{\varphi^2(R_k)}{G(R_k)} > i - 2\varepsilon.$$

Let $\Delta_s$ $(s = 1, 2, \ldots, m)$ be the sub-intervals appearing in the
composition of all the $R_k$. We obtain by taking account of (62):

$$\sum_{s=1}^{m} \frac{\varphi^2(\Delta_s)}{G(\Delta_s)} > i - 2\varepsilon. \tag{135}$$

Since $g(x)$ and $F(x)$ are continuous, we can regard the $\Delta_s$ as closed
or open intervals. These intervals do not need to cover $[a, b]$. On
adding the non-negative terms corresponding to the remaining intervals,
we get (135) all the more for the complete sum, and it follows,
since $\varepsilon$ is arbitrary, that integral (132) is the strict upper bound of
sums (127), given condition (129), where $f(x) \in L_2$. A further point:
since $g(x)$ is assumed continuous, the $h(x)$ defined by (130) is continuous.

We now show that, if condition (128) is satisfied, i.e. sums (127)
are bounded, $F(x)$ is expressible by (129), where $f(x) \in L_2$. Let $\Delta$ be
a sub-interval of $[a, b]$ and $\Delta_k$ the sub-intervals of some
division of it. It follows from (128), by Buniakowski's inequality:

$$\Big(\sum_k |\Delta_k F|\Big)^2 \le \Big(\sum_k \sqrt{\Delta_k g \cdot \Delta_k h}\Big)^2 \le \sum_k \Delta_k h \cdot \sum_k \Delta_k g, \tag{136}$$

i.e.

$$\Big(\sum_k |\Delta_k F|\Big)^2 \le \Delta g \cdot \Delta h.$$

The same inequality holds for the strict upper bound of the sums on
the left with any subdivisions, so that $F(x)$ is a function of bounded
variation and we have for its total variation $v(x)$:

$$(\Delta v(x))^2 \le \Delta g \cdot \Delta h. \tag{137}$$

If $\Delta_k$ are any desired non-overlapping intervals of $[a, b]$, it follows
from (136) that

$$\Big(\sum_k |\Delta_k F|\Big)^2 \le \sum_k \Delta_k g \cdot [h(b) - h(a)].$$

If the sum $\sum_k \Delta_k g$ on the right is $< \varepsilon$, we have

$$\sum_k |\Delta_k F| \le \sqrt{\varepsilon\,[h(b) - h(a)]},$$

and it follows from this, since $\varepsilon$ is arbitrary, that $F(x)$ is absolutely
continuous with respect to $g(x)$, i.e.

$$F(x) = \int_a^x f(x)\, dg(x) + C. \tag{138}$$


It remains to show that $f(x) \in L_2$. We form the bounded function:

$$f_n(x) = f(x) \text{ if } |f(x)| \le n; \quad f_n(x) = n \text{ if } f(x) > n; \quad f_n(x) = -n \text{ if } f(x) < -n,$$

and put

$$F_n(x) = \int_a^x f_n(x)\, dg(x). \tag{139}$$

The functions $f_n(x) \in L_2$ and hence, by what has been proved:

$$\sup \sum_k \frac{(\Delta_k F_n)^2}{\Delta_k g} = \int_a^b f_n^2(x)\, dg(x). \tag{140}$$

If integral (138) is taken over different sets $\mathcal{E}$ of $[a, b]$, which are
measurable with respect to $g(x)$, we get a set function whose total
variation on $[a, x]$ is given by the integral [73]:

$$v(x) = \int_a^x |f(x)|\, dg(x). \tag{141}$$

If we split $[a, x]$ into intervals only, we get the same total variation
for function (138) [74].
On taking (137) into account, we have

$$\sup \sum_k \frac{(\Delta_k v)^2}{\Delta_k g} = M, \tag{142}$$

where $M$ is a finite number. On the other hand, by (139) and (141),
we have $|\Delta_k F_n| \le \Delta_k v$, and, on taking (140) and (142) into account,
we obtain for any $n$:

$$\int_a^b f_n^2(x)\, dg(x) \le M,$$

whence it follows that $f(x) \in L_2$. The above discussion yields the following
theorem:

THEOREM 2. Condition (128) is equivalent to the fact that $F(x)$ is
expressible in the form (138), where $f(x) \in L_2$, and if (128) is satisfied, the
strict upper bound of sums (127) is given by the integral (132).
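
A simple numerical case may make Theorem 2 concrete. Take, as an assumed example, $g(x) = x$ and $F(x) = x^2$ on $[0, 1]$, so that $f(x) = 2x$ and integral (132) equals $4/3$. For a division into $n$ equal parts the sum (127) works out to $4/3 - 1/(3n^2)$, which the sketch below reproduces; the helper names `hellinger_sum` and `uniform` are introduced only for this illustration.

```python
def hellinger_sum(F, g, points):
    """S_delta = sum_k (Delta_k F)^2 / Delta_k g over consecutive points; terms 0/0 count as zero."""
    total = 0.0
    for left, right in zip(points, points[1:]):
        dF = F(right) - F(left)
        dg = g(right) - g(left)
        if dg != 0:
            total += dF * dF / dg
    return total

def uniform(n, a=0.0, b=1.0):
    """Points of the division of [a, b] into n equal sub-intervals."""
    return [a + (b - a) * k / n for k in range(n + 1)]

def F(x):
    return x * x      # F(x) is the integral of f(x) = 2x with respect to g

def g(x):
    return x          # g(x) = x, so Delta g is ordinary length

# Each division below is a continuation of the previous one.
sums = [hellinger_sum(F, g, uniform(n)) for n in (1, 2, 4, 8, 16, 1024)]
limit = 4.0 / 3.0     # integral of f^2 = 4x^2 over [0, 1]
```

The values in `sums` do not decrease along this chain of continuations and approach `limit` from below.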

Throughout the above, we do not need to assume that $g(x)$ and $F(x)$
are continuous. Without this assumption, we take the basic interval
as semi-open and split it into semi-open intervals for which
$\Delta g = g(\beta + 0) - g(\alpha + 0)$. All our results are retained, apart from the
continuity of $h(x)$. It may be mentioned that, as a consequence of
condition (128) and the continuity of $g(x)$, we can write (128) with
continuous $h(x)$.


83. Properties of the Hellinger integral. The strict upper bound of
sums (127) is the Hellinger integral, for which a similar notation to
(117) is used:

$$\int_a^b \frac{[dF(x)]^2}{dg(x)}, \tag{143}$$

and where we have the formula:

$$\int_a^b \frac{[dF(x)]^2}{dg(x)} = \int_a^b f^2(x)\, dg(x). \tag{144}$$

Let us show that this integral is simply the limit of sums (127) on
indefinite refinement of the sub-intervals $\Delta_k$.

THEOREM 3. If $F(x)$ satisfies condition (128) and $F_1(x)$ the analogous
condition

$$(\Delta F_1)^2 \le \Delta g \cdot \Delta h_1 \tag{145}$$

($g(x)$ is continuous), sum (127) and the sum

$$\sum_k \frac{\Delta_k F \cdot \Delta_k F_1}{\Delta_k g} \tag{146}$$

have a definite limit on indefinite refinement of the $\Delta_k$, the limit of (127)
being equal to the strict upper bound of these sums, i.e. integral (143).


As above, let $i$ be the strict upper bound of sums (127). Given
$\varepsilon > 0$, there exists a fixed subdivision $\delta_0$ such that $S_{\delta_0} > i - \varepsilon$. Let
$\delta$ be so fine a subdivision that every sub-interval of $\delta$ contains not
more than one point of subdivision of $\delta_0$ and that the increment of
the continuous function $h(x)$ in each sub-interval of $\delta$ is not greater
than $\varepsilon$. We have for the subdivision $\delta\delta_0$:

$$S_{\delta\delta_0} \ge S_{\delta_0} > i - \varepsilon. \tag{147}$$

If $p$ is the number of points of subdivision in $\delta_0$, not more than $p$
sub-intervals of $\delta$ are split into two on passing to $\delta\delta_0$, and the corresponding
non-negative term of $S_\delta$ is replaced by two non-negative terms of
$S_{\delta\delta_0}$. Each of these three terms is not greater than $\varepsilon$, by what has been
said regarding the increment of $h(x)$ and property (128), so that

$$S_{\delta\delta_0} - S_\delta \le 2p\varepsilon.$$

On comparing with (147), we get $S_\delta > i - (2p + 1)\varepsilon$, whence it
follows, since $\varepsilon$ is arbitrary, that the sum (127) tends to $i$ on indefinite
refinement of the $\Delta_k$. As regards sums (146), we notice that
$F(x) + F_1(x)$ is expressible by a formula of the type (138), where
$f(x) + f_1(x) \in L_2$, and the sums

$$\sum_k \frac{[\Delta_k (F + F_1)]^2}{\Delta_k g},$$

like the analogous sums for $F(x)$ and $F_1(x)$, have a limit on indefinite
refinement of the $\Delta_k$. It follows from this that the sum

$$\sum_k \frac{\Delta_k F \cdot \Delta_k F_1}{\Delta_k g} = \frac{1}{2} \sum_k \frac{[\Delta_k (F + F_1)]^2}{\Delta_k g} - \frac{1}{2} \sum_k \frac{(\Delta_k F)^2}{\Delta_k g} - \frac{1}{2} \sum_k \frac{(\Delta_k F_1)^2}{\Delta_k g}$$

also has a limit. It coincides with integral (122). We thus obtain the
following Hellinger integrals:

$$\int_a^b \frac{(dF)^2}{dg} = \lim \sum_k \frac{(\Delta_k F)^2}{\Delta_k g}; \quad \int_a^b \frac{dF\, dF_1}{dg} = \lim \sum_k \frac{\Delta_k F \cdot \Delta_k F_1}{\Delta_k g}. \tag{148}$$

By what was said in [81]:

$$\int_a^b \frac{dF\, dF_1}{dg} = \int_a^b f(x)\, f_1(x)\, dg(x),$$

where

$$F_1(x) = \int_a^x f_1(x)\, dg(x). \tag{149}$$
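
The mixed sums (146) can be followed numerically in the same way as the sums (127). As an assumed example take $g(x) = x$, $F(x) = x^2$ and $F_1(x) = x^3$ on $[0, 1]$, so that $f(x) = 2x$, $f_1(x) = 3x^2$ and the integral of $f f_1$ is $3/2$. The helper name `mixed_sum` is hypothetical.

```python
def mixed_sum(F, F1, g, points):
    """Sum (146): sum_k Delta_k F * Delta_k F1 / Delta_k g; terms 0/0 count as zero."""
    total = 0.0
    for left, right in zip(points, points[1:]):
        dg = g(right) - g(left)
        if dg != 0:
            total += (F(right) - F(left)) * (F1(right) - F1(left)) / dg
    return total

def F(x):
    return x ** 2     # density f(x) = 2x with respect to g

def F1(x):
    return x ** 3     # density f1(x) = 3x^2 with respect to g

def g(x):
    return x

n = 1000
points = [k / n for k in range(n + 1)]   # division of [0, 1] into n equal parts
value = mixed_sum(F, F1, g, points)
expected = 1.5                           # integral of 2x * 3x^2 over [0, 1]
```

For equal sub-intervals the sum equals $3/2 - 1/(2n^2)$, so `value` differs from `expected` by about $5 \cdot 10^{-7}$ when $n = 1000$.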
We can consider the more general sums

$$\sum_k u(\xi_k)\, \frac{(\Delta_k F)^2}{\Delta_k g} \tag{150}$$

and

$$\sum_k u(\xi_k)\, \frac{\Delta_k F \cdot \Delta_k F_1}{\Delta_k g}, \tag{151}$$

where $u(x)$ is continuous in $[a, b]$, integrals (148) exist and $\xi_k$ is any
point of $\Delta_k$. These sums also have a definite limit on indefinite
refinement of the $\Delta_k$. It is sufficient to prove this for sums (150). We take
the sums

$$\sum_k m_k\, \frac{(\Delta_k F)^2}{\Delta_k g}, \tag{152}$$

where $m_k$ is the least value of the continuous function $u(x)$ on the closed
interval $\Delta_k$. It can be shown, precisely as above, that the sums do not
decrease on adding new points of subdivision, that they are bounded
and have a definite limit on indefinite refinement of the $\Delta_k$. By the
uniform continuity of $u(x)$ and condition (128), the difference between
sums (150) and (152) tends to zero on indefinite refinement of the $\Delta_k$, so
that sums (150) also have a definite limit. We thus obtain the following
Hellinger integrals:

$$\int_a^b u(x)\, \frac{(dF)^2}{dg} = \lim \sum_k u(\xi_k)\, \frac{(\Delta_k F)^2}{\Delta_k g}; \quad \int_a^b u(x)\, \frac{dF\, dF_1}{dg} = \lim \sum_k u(\xi_k)\, \frac{\Delta_k F \cdot \Delta_k F_1}{\Delta_k g}. \tag{153}$$

The theory given above can be extended to the case when $g(x)$ is
discontinuous. The sums indicated have a definite limit for a sequence
of subdivisions $\delta_n$, regular in the sense of the general Stieltjes integral.

All the above theory obviously still holds when $F(x)$, $F_1(x)$ and $u(x)$
are complex functions, where $F(x)$ and $F_1(x)$ must satisfy the
conditions

$$|\Delta F|^2 \le \Delta g \cdot \Delta h; \quad |\Delta F_1|^2 \le \Delta g \cdot \Delta h_1.$$

The squares must be everywhere replaced by the square of the
modulus, i.e. $(\Delta_k F)^2$ by $|\Delta_k F|^2$.


Some further simple properties of the Hellinger integral may be
noticed. Let $\Phi(x)$ satisfy condition (128) and be non-decreasing. We
form the function

$$F(x) = \int_a^x u(x)\, d\Phi(x), \tag{154}$$

where $u(x)$ is continuous, and we consider the sums

$$\sum_k \frac{(\Delta_k F)^2}{\Delta_k g}. \tag{155}$$

We apply the mean value theorem:

$$\Delta_k F = \int_{\Delta_k} u(x)\, d\Phi(x) = u(\xi_k)\, \Delta_k \Phi,$$

where $\xi_k \in \Delta_k$. Sum (155) can be rewritten as

$$\sum_k \frac{(\Delta_k F)^2}{\Delta_k g} = \sum_k u^2(\xi_k)\, \frac{(\Delta_k \Phi)^2}{\Delta_k g},$$

and we get in the limit:

$$\int_a^b \frac{(dF)^2}{dg} = \int_a^b u^2(x)\, \frac{(d\Phi)^2}{dg}.$$

Similarly, if we have along with (154):

$$F_1(x) = \int_a^x u_1(x)\, d\Phi_1(x),$$

where $\Phi_1(x)$ satisfies condition (128) and is non-decreasing, and if
$u_1(x)$ is continuous, we obtain

$$\int_a^b \frac{dF\, dF_1}{dg} = \int_a^b u(x)\, u_1(x)\, \frac{d\Phi\, d\Phi_1}{dg}. \tag{156}$$

If $\Phi(x)$ and $\Phi_1(x)$ satisfy condition (128), but are not monotonic, we
also get (156), by using the canonical form for $\Phi(x)$ and $\Phi_1(x)$ as a
difference between two non-decreasing functions. Notice that $F(x)$ and
$F_1(x)$ obviously also satisfy condition (128).


CHAPTER IV 


METRIC AND NORMED SPACES 


84. Metric space. The first part of this chapter will be devoted to 
the theory of certain abstract spaces, and will be followed by the 
application of this theory to various concrete spaces — primarily, to 
function spaces, i.e. sets of functions of a definite class. The same 
abstract space may take several different concrete forms, so that it is 
expedient to discuss abstract spaces. 

Every abstract space is a non-empty set of elements, which is 
subject to certain axioms. The nature of the elements is not defined, 
and the theory of any abstract space is a consequence of the axioms 
which define the space. For the sake of a connected exposition, we 
shall first give the full theory of abstract spaces, then later describe 
the application of this theory to the various concrete forms of the 
spaces. Let us start with the theory of so-called metric spaces. 

A set $X$ of elements, which we shall denote by successive letters of the Roman alphabet ($x$, $y$, $z$ etc.), is called a metric space if each pair of its elements $x$, $y$ is associated with a non-negative number $\rho(x, y)$ (the distance between $x$ and $y$) such that the following conditions hold:

1. $\rho(x, y) \ge 0$, where the = sign holds when and only when $x = y$, i.e. $x$ and $y$ are the same element.

2. $\rho(y, x) = \rho(x, y)$ (the axiom of symmetry); (1)

3. $\rho(x, z) \le \rho(x, y) + \rho(y, z)$ (the triangle axiom). (2)

These conditions must hold for any elements $x, y, z \in X$. If $y_1, y_2, \ldots, y_m$ are any elements of $X$, we obtain by repeated application of (2):

$$\rho(y_1, y_m) \le \rho(y_1, y_2) + \rho(y_2, y_3) + \cdots + \rho(y_{m-1}, y_m). \tag{$2_1$}$$
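The three axioms lend themselves to a mechanical check on a finite sample of points. The following Python sketch is an illustration only: the names `check_metric_axioms`, `euclid` and `squared`, the sample points and the tolerance are our own assumptions and do not come from the text. It tests a candidate distance against the first axiom and against (1) and (2), and shows that the squared Euclidean distance fails the triangle axiom.

```python
import itertools
import math

def check_metric_axioms(points, rho, tol=1e-12):
    """Return True if rho satisfies the three axioms on every pair and triple of points."""
    for x, y in itertools.product(points, repeat=2):
        d = rho(x, y)
        if d < 0:
            return False
        if (d <= tol) != (x == y):      # rho = 0 when and only when x = y
            return False
        if abs(d - rho(y, x)) > tol:    # axiom of symmetry (1)
            return False
    for x, y, z in itertools.product(points, repeat=3):
        if rho(x, z) > rho(x, y) + rho(y, z) + tol:   # triangle axiom (2)
            return False
    return True

def euclid(x, y):
    """Ordinary Euclidean distance between two tuples of numbers."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def squared(x, y):
    """Squared Euclidean distance: symmetric and positive, but not a metric."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

# Hypothetical sample; the three collinear points expose the failure of (2).
sample = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.5, 1.5)]
print(check_metric_axioms(sample, euclid))    # True
print(check_metric_axioms(sample, squared))   # False
```

A check of this kind can only refute a candidate distance; passing it on a finite sample does not prove the axioms for the whole space.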


Let $x_n$ ($n = 1, 2, \ldots$) be an infinite sequence of elements and let an element $x_0$ exist such that $\rho(x_0, x_n) \to 0$ as $n \to \infty$. We say that $x_0$ is the limit of the sequence $x_n$, and write $x_n \Rightarrow x_0$, or $\lim x_n = x_0$. It may easily be seen that a sequence cannot have more than one limit. For, let $x_n \Rightarrow x_0$ and $x_n \Rightarrow y_0$. It has to be shown that $x_0 = y_0$. We have by (2): $\rho(x_0, y_0) \le \rho(x_0, x_n) + \rho(x_n, y_0)$. On indefinite increase of $n$ the right-hand side tends to zero, and we obtain in the limit $\rho(x_0, y_0) \le 0$. But $\rho(x_0, y_0) \ge 0$, and it follows from these two inequalities that $\rho(x_0, y_0) = 0$, i.e. $x_0 = y_0$. If $x_n \Rightarrow x_0$, then obviously every infinite subsequence $x_{n_k} \Rightarrow x_0$.

Let us show that $\rho(x, y)$ is a continuous function of $x$ and $y$, i.e. if $x_n \Rightarrow x_0$ and $y_n \Rightarrow y_0$, then $\rho(x_n, y_n) \to \rho(x_0, y_0)$.

We can write by ($2_1$):

$$\rho(x_n, y_n) \le \rho(x_n, x_0) + \rho(x_0, y_0) + \rho(y_0, y_n);$$

$$\rho(x_0, y_0) \le \rho(x_0, x_n) + \rho(x_n, y_n) + \rho(y_n, y_0),$$

whence

$$\rho(x_n, y_n) - \rho(x_0, y_0) \le \rho(x_n, x_0) + \rho(y_0, y_n);$$

$$\rho(x_0, y_0) - \rho(x_n, y_n) \le \rho(x_0, x_n) + \rho(y_n, y_0),$$

i.e.

$$|\rho(x_0, y_0) - \rho(x_n, y_n)| \le \rho(x_0, x_n) + \rho(y_0, y_n).$$

As $n \to \infty$, the right-hand side tends to zero, whence it follows that $\rho(x_n, y_n) \to \rho(x_0, y_0)$.

If a sequence $x_n$ has a limit ($x_n \Rightarrow x_0$), given any $\varepsilon > 0$ there exists an $N$ such that

$$\rho(x_m, x_n) \le \varepsilon \quad \text{for } m \text{ and } n > N. \tag{3}$$

This follows at once from $\rho(x_m, x_n) \le \rho(x_m, x_0) + \rho(x_0, x_n)$, the right-hand side of which tends to zero as $m$ and $n \to \infty$. But it does not follow from (3), on the basis of our axioms, that the sequence $x_n$ has a limit (Cauchy's test for the existence of a limit is not sufficient). If we introduce the supplementary requirement that the existence of a limit of the sequence $x_n$ follows from (3), the metric space in question is called a complete metric space.

Let $U$ be a set of elements of the metric space. It is said to be bounded if an element $x_0$ and a positive number $A$ exist such that $\rho(x_0, x) \le A$ for all $x$ of $U$. Let $x_1$ be any fixed element different from $x_0$. We have: $\rho(x_1, x) \le \rho(x_1, x_0) + \rho(x_0, x)$, and we obtain for elements of $U$: $\rho(x_1, x) \le \rho(x_1, x_0) + A$, where the right-hand side is a positive number independent of $x$. Thus the choice of element $x_0$ is of no importance in defining a bounded set $U$. It is easily shown that, if a sequence $x_n$ has a limit, the set of elements $x_n$ is bounded.




We describe $x_0$ as the limiting element of a set $U$ of elements of $X$ if there exists a sequence $x_n$ of elements of $U$ such that $x_n \Rightarrow x_0$. A set $U$ containing all its limiting elements is said to be closed. If $U$ is not closed, and we associate with it all its limiting elements, the new set, which we write as $\bar{U}$, is closed [31]. The passage from $U$ to $\bar{U}$ is described as the closure of $U$. If $U$ is closed, then $\bar{U} = U$. If $U$ is the empty set (contains no elements), $\bar{U}$ must also be reckoned empty. The set of elements satisfying the condition $\rho(x_0, x) < R$, where $x_0$ is a fixed element and $R$ is a positive number, is called an open sphere with centre $x_0$ and radius $R$, whilst the sphere is closed in the case of $\rho(x_0, x) \le R$.

It is easily shown, from the fact that $\rho(x_0, x)$ is a continuous function of $x$, that a closed sphere is a closed set in the sense of the above definition.

Notice that any non-empty set $U$ of $X$ is also a metric space, if the same definition of $\rho(x, y)$ as for the whole of $X$ is retained for the elements of $U$. The space $X$ may in fact consist of a finite number of elements.

Let $X$ and $X'$ be two metric spaces, and suppose a one-to-one correspondence can be established between their elements such that $\rho(x, y) = \rho(x', y')$, where $x$ and $x'$, $y$ and $y'$ are any corresponding elements of $X$ and $X'$. In this case $X$ and $X'$ are called isometric. It is meaningless to differentiate between isometric spaces, from the point of view of abstract theory.


85. The completion of a metric space. A sequence $x_n$ of elements of $X$ is described as fundamental (or mutually convergent), if condition (3) is fulfilled for it. If $X$ is not a complete space, not every fundamental sequence has a limit. Let us show that, in this case, new elements (sometimes termed "ideal elements") can be associated with the space, with a corresponding extension of the concept of distance, in such a way that the space obtained is complete.

We shall start by proving a lemma.

LEMMA. If $x_n$ and $y_n$ are two fundamental sequences, the numerical sequence $\rho(x_n, y_n)$ has a limit.

By ($2_1$), $\rho(x_n, y_n) \le \rho(x_n, x_m) + \rho(x_m, y_m) + \rho(y_m, y_n)$, whence $\rho(x_n, y_n) - \rho(x_m, y_m) \le \rho(x_n, x_m) + \rho(y_m, y_n)$. On interchanging the subscripts $m$ and $n$ and using (1), we get $\rho(x_m, y_m) - \rho(x_n, y_n) \le \rho(x_n, x_m) + \rho(y_m, y_n)$. It follows from the last two inequalities that

$$|\rho(x_n, y_n) - \rho(x_m, y_m)| \le \rho(x_n, x_m) + \rho(y_m, y_n).$$

On indefinite increase of $m$ and $n$, the right-hand side tends to zero, so that Cauchy's test for the existence of a limit is fulfilled for the numerical sequence $\rho(x_n, y_n)$, which is what we wanted to prove.

Let us classify all the fundamental sequences, viz. the fundamental sequences $x_n$ and $x'_n$ for which $\rho(x_n, x'_n) \to 0$ are all put into one class. If $x_n$ and $x'_n$, as also $x_n$ and $x''_n$, belong to one class, then $x'_n$ and $x''_n$ belong to one class, since $\rho(x_n, x'_n) \to 0$ and $\rho(x_n, x''_n) \to 0$ imply, by ($2_1$), that $\rho(x'_n, x''_n) \to 0$. The limit of $\rho(x_n, y_n)$ for sequences $x_n$ and $y_n$ belonging to different classes will be non-zero (positive). It also follows from (2) that, if the sequence $x_n$ has a limit $x_0$ in $X$, any other sequence $x'_n$ of the same class is also convergent and has the same limit $x_0$. Since the distance is continuous, sequences belonging to different classes cannot have the same limit.

These classes of fundamental sequences therefore fall into two types. We shall start by describing classes of the first type. Let $x_0$ be an element of $X$. We shall have a class of sequences having a limit equal to $x_0$. This class includes for instance the sequence $x_n$ in which all the elements are equal to $x_0$. Each $x_0$ will have its own class of sequences. Classes of the second type consist of sequences $x_n$ having no limit in $X$. If $X$ is complete, there are no classes of the second type.

Suppose that $X$ is not a complete space. We now form a new metric space $\tilde{X}$, taking as its elements the above-mentioned classes of fundamental sequences of $X$. We still have to introduce into $\tilde{X}$ the concept of distance and show that its three basic properties hold. Let $\tilde{x}$ and $\tilde{y}$ be two elements of $\tilde{X}$. We take any two sequences $x_n$ and $y_n$ from the classes of sequences corresponding to them and define $\rho(\tilde{x}, \tilde{y})$ by the formula

$$\rho(\tilde{x}, \tilde{y}) = \lim_{n \to \infty} \rho(x_n, y_n). \tag{4}$$

We show that the non-negative number $\rho(\tilde{x}, \tilde{y})$ is independent of the choice of $x_n$ and $y_n$ from the classes corresponding to $\tilde{x}$ and $\tilde{y}$. Let $x'_n$ and $y'_n$ be any two sequences of these classes and $\rho'(\tilde{x}, \tilde{y}) = \lim_{n \to \infty} \rho(x'_n, y'_n)$. We have to show that $\rho'(\tilde{x}, \tilde{y}) = \rho(\tilde{x}, \tilde{y})$. By ($2_1$): $\rho(x_n, y_n) \le \rho(x_n, x'_n) + \rho(x'_n, y'_n) + \rho(y'_n, y_n)$.

On observing that $\rho(x_n, x'_n) \to 0$, $\rho(y'_n, y_n) \to 0$ and passing to the limit in the inequality written, we get $\rho(\tilde{x}, \tilde{y}) \le \rho'(\tilde{x}, \tilde{y})$. Similarly, $\rho'(\tilde{x}, \tilde{y}) \le \rho(\tilde{x}, \tilde{y})$, so that $\rho'(\tilde{x}, \tilde{y}) = \rho(\tilde{x}, \tilde{y})$. Formula (4) thus defines $\rho(\tilde{x}, \tilde{y})$ uniquely. Let us now verify the three basic properties. Obviously, $\rho(\tilde{x}, \tilde{y}) \ge 0$.




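(1) Let $\rho(\tilde{x}, \tilde{y}) = 0$, i.e. $\rho(x_n, y_n) \to 0$. It follows from this that the sequences $x_n$ and $y_n$ are of the same class, i.e. $\tilde{x} = \tilde{y}$.

(2) The property $\rho(\tilde{x}, \tilde{y}) = \rho(\tilde{y}, \tilde{x})$ follows at once from $\rho(x_n, y_n) = \rho(y_n, x_n)$.

(3) We choose sequences $x_n$, $y_n$ and $z_n$ from the classes of sequences corresponding to $\tilde{x}$, $\tilde{y}$ and $\tilde{z}$. We have

$$\rho(\tilde{x}, \tilde{z}) = \lim \rho(x_n, z_n) \le \lim\,[\rho(x_n, y_n) + \rho(y_n, z_n)] \le \rho(\tilde{x}, \tilde{y}) + \rho(\tilde{y}, \tilde{z}).$$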


Let the class corresponding to $\tilde{x}$ be of the first type, and let $x_0$ be the limit of sequences of this class. We can identify the element $\tilde{x}$ of $\tilde{X}$ with the indicated element $x_0$ of $X$. The elements $\tilde{x}$, such that the corresponding classes are of the second type, are those elements of $\tilde{X}$ which have not appeared in $X$. If $\tilde{x}$ and $\tilde{y}$ are elements corresponding to classes of the first type, whilst $x_0$ and $y_0$ are the elements of $X$ identified with them in accordance with the above, we can put $x_n = x_0$ and $y_n = y_0$ in (4) for any $n$, and obtain

$$\rho(\tilde{x}, \tilde{y}) = \lim_{n \to \infty} \rho(x_0, y_0) = \rho(x_0, y_0),$$

i.e. the new distance is the same as the old one for elements belonging to $X$. If $\tilde{x}$ corresponds to a class of the first type ($x_0$ is the corresponding element of $X$), and $\tilde{y}$ to a class of the second type, (4) gives

$$\rho(\tilde{x}, \tilde{y}) = \lim_{n \to \infty} \rho(x_0, y_n).$$


Let us further show that, if the sequence $x_n$ belongs to the class defining the element $\tilde{x}$, then $\rho(\tilde{x}, x_n) \to 0$ as $n \to \infty$. We have by definition $\rho(\tilde{x}, x_n) = \lim_{m \to \infty} \rho(x_m, x_n)$. But, since $x_n$ is a fundamental sequence, (3) holds and $\rho(\tilde{x}, x_n) \le \varepsilon$ for $n > N$, i.e. $\rho(\tilde{x}, x_n) \to 0$ as $n \to \infty$.

We now show that $X$ is dense in $\tilde{X}$, i.e. if $\tilde{x}$ is any element of $\tilde{X}$ and $\varepsilon$ is any given positive number, there exists an element $x$ of $X$ such that $\rho(\tilde{x}, x) \le \varepsilon$. If the class corresponding to $\tilde{x}$ is of the first type and $\tilde{x}$ is identified with the element $x_0$ of $X$, given any $\varepsilon > 0$ we can put $x = x_0$, since $\rho(\tilde{x}, x_0) = \rho(x_0, x_0) = 0$. Let $\tilde{x}$ not belong to $X$, and let a fundamental sequence $x_n$ correspond to it. We fix an $m$ such that $\rho(x_n, x_m) \le \varepsilon$ for $n > m$, and show that we can put $x = x_m$. In fact, $\rho(\tilde{x}, x_m) = \lim_{n \to \infty} \rho(x_n, x_m)$, and, since $\rho(x_n, x_m) \le \varepsilon$ for $n > m$, we get

$$\rho(\tilde{x}, x_m) \le \varepsilon.$$




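We now show that $\tilde{X}$ is a complete space. Let $\tilde{x}_n$ be a fundamental sequence in $\tilde{X}$, i.e. $\rho(\tilde{x}_n, \tilde{x}_m) \le \varepsilon$ for $n$ and $m > N$. We have to show that there exists in $\tilde{X}$ an element $\tilde{x}$ such that $\rho(\tilde{x}, \tilde{x}_n) \to 0$ as $n \to \infty$. By what has been proved, given any $n$, there exists an element $x_n$ of $X$ such that $\rho(\tilde{x}_n, x_n) \le 1/n$. It is easily seen that the sequence $x_n$ of elements of $X$ is fundamental:

$$\rho(x_n, x_m) \le \rho(x_n, \tilde{x}_n) + \rho(\tilde{x}_n, \tilde{x}_m) + \rho(\tilde{x}_m, x_m) \le \frac{1}{n} + \frac{1}{m} + \rho(\tilde{x}_n, \tilde{x}_m).$$

The sequence $x_n$ appears in a class defining some element $\tilde{x}$ of $\tilde{X}$. Let us show that $\rho(\tilde{x}, \tilde{x}_n) \to 0$. This follows from the inequality

$$\rho(\tilde{x}, \tilde{x}_n) \le \rho(\tilde{x}, x_n) + \rho(x_n, \tilde{x}_n) \le \rho(\tilde{x}, x_n) + \frac{1}{n}$$

and the fact that $\rho(\tilde{x}, x_n) \to 0$, as we have seen above. The completeness of $\tilde{X}$ is proved.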

We must now prove a theorem that the completion of a metric 
space X is unique. 

THEOREM. The completion of a space X, such that X is dense in the 
new space, is unique up to isometric spaces. 

Let $Y$ be a complete metric space containing $X$, in which $X$ is dense. We have to show that $Y$ is isometric with $\tilde{X}$. It is naturally assumed here that the distance between two elements of $Y$ belonging to $X$ is the same as in $X$. Let $y$ be an element of $Y$. Since $X$ is dense in $Y$, there exists a sequence $x_n$ of elements of $X$ such that $\rho(y, x_n) \to 0$ in $Y$, so that $x_n$ is a fundamental sequence in $Y$ and in $X$. A definite element $\tilde{x}$ of $\tilde{X}$ corresponds to this sequence. It is easily seen that $\tilde{x}$ is independent of the choice of $x_n$, the only important point being that $\rho(y, x_n) \to 0$. We put $\tilde{x}$ in correspondence with the above-mentioned element $y$ of $Y$. Now suppose that we have a definite element $\tilde{x}'$ of $\tilde{X}$. We take some sequence of elements $x'_n$ of $X$ defining it. This sequence is fundamental in the complete space $Y$ and hence defines an element $y'$ of $Y$. It may easily be seen that $y'$ does not depend on the choice of the sequence $x'_n$, the only important point being that it defines $\tilde{x}'$. We set $y'$ in correspondence with $\tilde{x}'$. As is easily seen, we establish by this means a one-to-one correspondence between elements of $Y$ and $\tilde{X}$. It remains to show that $\rho(\tilde{x}, \tilde{x}') = \rho(y, y')$.

This follows from the definition of $\rho(\tilde{x}, \tilde{x}')$ in $\tilde{X}$ and the continuity of distance in $Y$:

$$\rho(\tilde{x}, \tilde{x}') = \lim_{n \to \infty} \rho(x_n, x'_n) = \rho(y, y').$$




We have dwelt in detail on the completion of a metric space because 
this process plays an important role in applications of the theory 
of metric spaces, and enables us to confine the discussion to complete 
spaces. Let us give three simple examples. 

(1) Let $X$ be the space of all real rational numbers $x, y, z, \ldots$, distance being defined by the formula $\rho(x, y) = |x - y|$. Clearly, $\rho(x, y)$ satisfies all three of the conditions in the definition of metric space. Let us take a fundamental sequence of real rational numbers $x_n$. By Cauchy's test, it must have a limit, though if this limit is an irrational number, the sequence has no limit in $X$, i.e. $X$ is not a complete space. Completion of it implies bringing in all the irrational numbers, and the space $\tilde{X}$ of all real numbers is complete.

(2) Let us take the space $C$ of all real functions $x(t), y(t), z(t), \ldots$, continuous in a finite interval $[a, b]$, and let us define the distance $\rho(x, y)$ by [14]:

$$\rho(x, y) = \max_{a \le t \le b} |x(t) - y(t)|.$$

This definition of $\rho(x, y)$ is easily shown to be permissible. The convergence $\rho(x, x_n) \to 0$ is here the uniform convergence $x_n(t) \to x(t)$ in the interval $a \le t \le b$, and if $\max_{a \le t \le b} |x_n(t) - x_m(t)| \to 0$ as $n$ and $m \to \infty$, there exists a continuous function $x(t)$ such that $x_n(t) \to x(t)$ uniformly [I; 144], i.e. $C$ is a complete space.

(3) We now take the space $F$ of the same functions continuous in $[a, b]$, but with a different definition of distance:

$$\rho(x, y) = \left[\int_a^b |x(t) - y(t)|^2\,dt\right]^{1/2}. \tag{5}$$

This is in fact permissible. We take the fundamental sequence $\{x_n(t)\}$ of $F$:

$$\int_a^b |x_n(t) - x_m(t)|^2\,dt \to 0 \quad \text{as } n \text{ and } m \to \infty.$$

It has a limit in the sense of metric (5) [56], but the limit function may be any function of $L_2$, since continuous functions are everywhere dense in $L_2$ [60]. If the limit function is not equivalent to a continuous function, the sequence, fundamental in $F$, has no limit in $F$, i.e. the space $F$ is not complete. Completion of it gives functions of $L_2$, not equivalent to continuous functions, and transforms $F$ into $L_2$. Instead of functions of one variable, we could have considered the set of functions $x(t_1, t_2, \ldots, t_n)$ of $n$ variables, continuous in a bounded closed set of $n$-dimensional space.
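The contrast between examples (2) and (3) can be seen numerically. The Python sketch below rests on assumptions of our own: the interval is taken as $[0, 1]$, the functions `ramp(n)` are a hypothetical family of continuous functions that rise from 0 to 1 over a width $1/n$ just after $t = 1/2$, and both distances are approximated on a uniform grid. The mean-square distance (5) between `ramp(n)` and `ramp(2n)` shrinks as $n$ grows, so the sequence is fundamental in $F$, while the maximum distance of example (2) stays near $1/2$, so the sequence is not fundamental in $C$ and has no continuous limit.

```python
import math

def ramp(n):
    """Continuous on [0, 1]: zero up to t = 1/2, then linear up to 1 over a width 1/n."""
    return lambda t: min(1.0, max(0.0, n * (t - 0.5)))

def dist_sup(x, y, grid):
    """Grid approximation of max |x(t) - y(t)| (the distance of example (2))."""
    return max(abs(x(t) - y(t)) for t in grid)

def dist_l2(x, y, grid):
    """Midpoint-rule approximation of the distance (5) on [0, 1]."""
    h = 1.0 / len(grid)
    return math.sqrt(h * sum((x(t) - y(t)) ** 2 for t in grid))

N = 20000                                    # number of midpoint nodes (an arbitrary choice)
grid = [(i + 0.5) / N for i in range(N)]

for n in (4, 16, 64, 256):
    x, y = ramp(n), ramp(2 * n)
    print(n, dist_l2(x, y, grid), dist_sup(x, y, grid))
```

For this family the exact value of (5) is $(12n)^{-1/2}$, which the printed second column should approximate; the limit in the sense of (5) is the discontinuous step function, an element of $L_2$ but not of $F$.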




It must be pointed out again that, when completing an actual metric space, it is important to be able to interpret the actual meaning of the new elements obtained from the completion. In the last example these were functions of $L_2$ not equivalent to continuous functions. Another point: as we have seen above, the space $L_2$ can be considered on any measurable set. We have discussed the case of a bounded closed set because we started from the space $F$ of continuous functions.

Our next theorem holds in complete metric spaces. In future we shall write $S(x, r)$ for an open sphere in $X$ with centre $x$ and radius $r$ and $\bar{S}(x, r)$ for the closed sphere.

THEOREM. Suppose we have a sequence of closed spheres $\bar{S}(x_n, r_n)$ ($n = 1, 2, \ldots$) in the complete metric space $X$, such that each successive sphere belongs to the previous one and the radii $r_n \to 0$ as $n \to \infty$. In this case there is a unique point belonging to all the $\bar{S}(x_n, r_n)$.

By hypothesis, $\bar{S}(x_{n+p}, r_{n+p}) \subset \bar{S}(x_n, r_n)$ ($p > 0$), so that $\rho(x_{n+p}, x_n) \le 2r_n$ for any $p > 0$, i.e. the sequence $x_n$ is fundamental, and has a limit, say $x_0$, since $X$ is complete. We take a fixed sphere $\bar{S}(x_n, r_n)$ and show that $x_0 \in \bar{S}(x_n, r_n)$.

In fact, all the elements of the sequence $x_n, x_{n+1}, \ldots$, having $x_0$ as limit, belong to $\bar{S}(x_n, r_n)$ by hypothesis, and $\bar{S}(x_n, r_n)$ is a closed set, so that $x_0 \in \bar{S}(x_n, r_n)$.

Suppose now that there exists an element $x'$ belonging to all the $\bar{S}(x_n, r_n)$. Let us show that $x' = x_0$. Since $x'$ and $x_0$ belong to all the $\bar{S}(x_n, r_n)$, we have $\rho(x', x_0) \le \rho(x', x_n) + \rho(x_n, x_0) \le 2r_n$. This inequality gives in the limit $\rho(x', x_0) \le 0$, i.e. $\rho(x', x_0) = 0$, whence it follows that $x'$ coincides with $x_0$. The theorem is proved.

A further point: every closed set $U$ of a complete metric space $X$ is also a complete metric space (with the assumption, obviously, that the distance $\rho(x, y)$ in $U$ is equal to the distance between $x$ and $y$ in $X$). All this follows from the fact that every sequence $x_n$, fundamental in $U$, has a limit in $X$, and this limit must belong to $U$, since $U$ is a closed set.


86. Operators and functionals. The principle of compressed mappings. Let $X$ and $X'$ be two given metric spaces. The correspondence $x' = Ax$, relating definite elements $x'$ of $X'$ to elements $x$ of $X$, is called an operator acting from $X$ into $X'$. An operator may not be defined in the whole of $X$. The set of elements $x$ of $X$ on which the operator $A$ is defined is called the domain of definition of $A$ and will be denoted here by $D(A)$. We shall denote the set of values of $Ax$ by $R(A)$. This is a set of elements of $X'$. If $R(A)$ is the whole of $X'$, the equation

$$x' = Ax \tag{6}$$

has at least one solution for any $x'$ of $X'$. Suppose that $A$ establishes a one-to-one correspondence between $D(A)$ and $R(A)$, i.e. given different $x$ of $D(A)$, different $x'$ of $R(A)$ are obtained by (6). In this case equation (6) has a unique solution in $D(A)$ for every $x'$ of $R(A)$.

Functionals represent a particular, but extremely important, case of operators. The latter are called functionals when $X'$ is the real number space, with the definition of distance of [85]: $\rho(x', y') = |x' - y'|$. The space of all complex numbers is also occasionally taken, with the same definition of distance.

The following test shows if the equation $x - Ax = 0$ is uniquely soluble, in the case when $X'$ coincides with $X$.

THEOREM. (The principle of compressed mappings.) If the operator $A$ maps the complete metric space $X$ into itself, $D(A) = X$, and for any $x$ and $y$ of $X$:

$$\rho(Ax, Ay) \le \alpha\,\rho(x, y), \tag{7}$$

where $\alpha$ is a number satisfying the condition $0 < \alpha < 1$, the equation $x = Ax$ has one and only one solution. This solution can be obtained as the limit of the sequence

$$x_1 = Ax, \quad x_2 = Ax_1, \quad x_3 = Ax_2, \ldots, \tag{8}$$

formed with an arbitrary choice of the initial element $x$.

In the present case, $D(A) = X$ and $R(A) \subset X$. We have, for $n \ge 2$,

$$\rho(x_n, x_{n+1}) = \rho(Ax_{n-1}, Ax_n) \le \alpha\,\rho(x_{n-1}, x_n).$$

On applying the same inequality to $\rho(x_{n-1}, x_n)$ and so on, we get $\rho(x_n, x_{n+1}) \le \alpha^{n-1} \rho(x_1, x_2)$ ($n = 1, 2, 3, \ldots$), whence it follows that, with $m > n$:

$$\rho(x_n, x_m) \le \rho(x_n, x_{n+1}) + \rho(x_{n+1}, x_{n+2}) + \cdots + \rho(x_{m-1}, x_m) \le \alpha^{n-1}(1 + \alpha + \cdots + \alpha^{m-n-1})\,\rho(x_1, x_2) \le \frac{\alpha^{n-1}}{1 - \alpha}\,\rho(x_1, x_2).$$

On observing that $\alpha^{n-1} \to 0$ and $\rho(x_n, x_m) = \rho(x_m, x_n)$, we see that $\rho(x_n, x_m) \to 0$ as $n$ and $m \to \infty$. Since $X$ is complete, the sequence $x_n$ has a limit, which we write as $x_0$ ($x_n \Rightarrow x_0$). Let us show that $Ax_n \Rightarrow Ax_0$:

$$\rho(Ax_n, Ax_0) \le \alpha\,\rho(x_n, x_0) \to 0,$$

since $\rho(x_n, x_0) \to 0$. On passing to the limit in the equation $x_n = Ax_{n-1}$, we get $x_0 = Ax_0$. It remains to show that the solution of $x = Ax$ is unique. Let $x'$ be a solution of this equation: $x' = Ax'$. We have to show that $x' = x_0$. We have $\rho(x_0, x') = \rho(Ax_0, Ax') \le \alpha\,\rho(x_0, x')$, i.e. $(1 - \alpha)\,\rho(x_0, x') \le 0$, whence $\rho(x_0, x') = 0$, so that $x'$ coincides with $x_0$. The theorem is proved.

Note. Let $U$ be a closed set of $X$. If $D(A) = U$, $R(A) \subset U$ and condition (7) is satisfied ($0 < \alpha < 1$), the theorem holds, where $x_0 \in U$ and every $x'$ satisfying $x' = Ax'$ coincides with $x_0$. We are assuming that $x' \in U$, since $A$ is defined in $U$.
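The successive approximations (8) translate directly into a short loop. The Python sketch below is illustrative only: the function name `fixed_point`, its parameters and the test operator are assumptions of ours. The stopping rule uses the bound $\rho(x_n, x_0) \le \frac{\alpha}{1-\alpha}\,\rho(x_{n-1}, x_n)$, which is not stated in the text but follows from (7) and the triangle axiom. As a test case we take $Ax = \cos x$ on the closed set $U = [0, 1]$ of the real line, in the setting of the Note: $\cos$ maps $[0, 1]$ into itself, and (7) holds there with $\alpha = \sin 1 < 1$ by the mean value theorem.

```python
import math

def fixed_point(A, x, rho, alpha, eps=1e-12, max_iter=10_000):
    """Form x_1 = A(x), x_2 = A(x_1), ... as in (8).

    Stops once alpha / (1 - alpha) * rho(x_{n-1}, x_n) <= eps, which bounds
    the distance from x_n to the solution when (7) holds with constant alpha.
    Returns the last approximation and the number of steps taken.
    """
    for n in range(1, max_iter + 1):
        x_next = A(x)
        if alpha / (1.0 - alpha) * rho(x, x_next) <= eps:
            return x_next, n
        x = x_next
    raise RuntimeError("no convergence; condition (7) may not hold")

# Hypothetical test operator: A = cos on U = [0, 1], alpha = sin(1).
root, steps = fixed_point(math.cos, 0.5, lambda a, b: abs(a - b), math.sin(1.0))
print(root, steps)
```

The constant $\alpha$ must be supplied by the user; the loop does not verify (7), so a wrong $\alpha$ invalidates the error bound even if the iteration happens to converge.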


87. Examples. Before turning to examples of the application of the principle of compressed mappings, some examples of complete metric spaces will be given.

1. The space $R_n$ of all sequences of $n$ real numbers. The distance between elements $x(a_1, a_2, \ldots, a_n)$ and $y(b_1, b_2, \ldots, b_n)$ of $R_n$ is defined as follows:

$$\rho(x, y) = \left[\sum_{k=1}^{n} (a_k - b_k)^2\right]^{1/2}. \tag{9}$$

We can also define the complex space $R_n$ of sequences of $n$ complex numbers. In formula (9) for the distance we have to replace $(a_k - b_k)^2$ by $|a_k - b_k|^2$. This replacement also applies for later examples of spaces of sequences or functions.

2. The space $m$ of jointly bounded infinite sequences of numbers $x(a_1, a_2, \ldots)$, i.e. given any element $x$ of $m$, there exists a positive number $m_x$ such that $|a_i| \le m_x$ for any $i$.

The formula for $\rho(x, y)$ is

$$\rho(x, y) = \sup_k |a_k - b_k|. \tag{10}$$

Convergence in $m$ is equivalent to a coordinate convergence, uniform with respect to the number of the component.

3. The space $s$ of all infinite numerical sequences, where

$$\rho(x, y) = \sum_{k=1}^{\infty} \frac{1}{2^k}\,\frac{|a_k - b_k|}{1 + |a_k - b_k|}. \tag{11}$$

The proof that this definition of $\rho(x, y)$ is permissible follows the same lines as the proof below for the analogous function space $S$. Convergence in space $s$, as in $R_n$, is equivalent to a coordinate convergence.

4. The space $l_p$ ($p \ge 1$) of infinite sequences of complex numbers $a_k$ such that

$$\sum_{k=1}^{\infty} |a_k|^p < +\infty, \tag{12}$$

where

$$\rho(x, y) = \left[\sum_{k=1}^{\infty} |a_k - b_k|^p\right]^{1/p}. \tag{13}$$

The triangle rule is obtained from Minkovskii's inequality for sums with $p > 1$ [62] and is obvious with $p = 1$.




5. The space $C$ of functions $\varphi(x)$, where $x$ is a point of $n$-dimensional space $R_n$, that are continuous on a bounded closed set $\mathcal{E}$, where

$$\rho(\varphi, \psi) = \max_{x \in \mathcal{E}} |\varphi(x) - \psi(x)|. \tag{14}$$

6. The space $M$ of functions $\varphi(x)$, defined on a (Lebesgue) measurable set $\mathcal{E}$ of $R_n$. Equivalent functions are identified, and every function of $M$ is bounded (or equivalent to a bounded function).

The definition of distance is

$$\rho(\varphi, \psi) = \inf_{m(\mathcal{E}_0) = 0}\ \sup_{\mathcal{E} - \mathcal{E}_0} |\varphi(x) - \psi(x)|. \tag{15}$$

The meaning of this definition is as follows: we exclude from $\mathcal{E}$ a set $\mathcal{E}_0$ of measure zero, define the strict upper bound of $|\varphi(x) - \psi(x)|$ on the remaining set $\mathcal{E} - \mathcal{E}_0$, choose the set $\mathcal{E}_0$ in all possible ways and find the strict lower bound of the resulting set of non-negative strict upper bounds $\sup_{\mathcal{E} - \mathcal{E}_0} |\varphi(x) - \psi(x)|$. Occasionally, (15) is written as

$$\rho(\varphi, \psi) = \operatorname*{vrai\,max}_{x \in \mathcal{E}} |\varphi(x) - \psi(x)|. \tag{16}$$

If $\mathcal{E}$ is a bounded closed set, $C$ is part of $M$ and $\rho(\varphi, \psi)$ is the same for $C$ as for $M$, i.e. $C$ is isometric to part of $M$.

7. The space $S$ of all functions $\varphi(x)$ measurable on a measurable point set $\mathcal{E}$ of finite measure of $R_n$, where

$$\rho(\varphi, \psi) = \int_{\mathcal{E}} \frac{|\varphi(x) - \psi(x)|}{1 + |\varphi(x) - \psi(x)|}\,dx. \tag{17}$$

Here and in future, we understand the Lebesgue measure and integral in the corresponding $R_n$.

8. The space $L_p(\mathcal{E})$ ($p \ge 1$) of functions $\varphi(x)$ measurable on the measurable set $\mathcal{E}$ and such that

$$\int_{\mathcal{E}} |\varphi(x)|^p\,dx < +\infty,$$

where

$$\rho(\varphi, \psi) = \left[\int_{\mathcal{E}} |\varphi(x) - \psi(x)|^p\,dx\right]^{1/p}. \tag{18}$$

The distance $\rho(\varphi, \psi)$ has the properties mentioned in the axioms [62].

9. The space $V[a, b]$ of functions $\varphi(x)$ of bounded variation on the closed interval $a \le x \le b$, continuous from the right at interior points of the interval and equal to zero at $x = a$, where

$$\rho(\varphi, \psi) = \bigvee_a^b\,[\varphi(x) - \psi(x)]. \tag{19}$$

If we dispense with the requirement $\varphi(a) = 0$, $\rho(\varphi, \psi)$ is defined as follows:

$$\rho(\varphi, \psi) = |\varphi(a) - \psi(a)| + \bigvee_a^b\,[\varphi(x) - \psi(x)]. \tag{20}$$

This widens the space, and the original space is isometric with part of the widened space.




All the above spaces are complete. 

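We have proved that $L_p$ and $l_p$ are complete. The proof presents no difficulty in the other cases and will be omitted.

Let us consider the space $S$ in more detail. On observing that $\omega(t) = t/(1 + t) = 1 - 1/(1 + t)$ is increasing for $t \ge 0$, we can write

$$\frac{|t + \tau|}{1 + |t + \tau|} \le \frac{|t| + |\tau|}{1 + |t| + |\tau|} \le \frac{|t|}{1 + |t|} + \frac{|\tau|}{1 + |\tau|},$$

whence follows the triangle axiom for $S$. We prove now that convergence in $S$ is equivalent to convergence in measure.

Let $\varphi_n(x) \to \varphi(x)$ in measure on $\mathcal{E}$. Let us show that $\rho(\varphi, \varphi_n) \to 0$. We introduce the sets $\mathcal{E}_n(\delta) = \mathcal{E}[\,|\varphi(x) - \varphi_n(x)| \ge \delta\,]$. By hypothesis, $m[\mathcal{E}_n(\delta)] \to 0$ as $n \to \infty$ with any fixed $\delta > 0$. We have

$$\rho(\varphi, \varphi_n) = \int_{\mathcal{E}} \frac{|\varphi(x) - \varphi_n(x)|}{1 + |\varphi(x) - \varphi_n(x)|}\,dx \le \int_{\mathcal{E}_n(\delta)} dx + \int_{\mathcal{E} - \mathcal{E}_n(\delta)} \frac{|\varphi(x) - \varphi_n(x)|}{1 + |\varphi(x) - \varphi_n(x)|}\,dx,$$

whence we obtain, since $\omega(t)$ is increasing and $|\varphi(x) - \varphi_n(x)| < \delta$ on the set $\mathcal{E} - \mathcal{E}_n(\delta)$:

$$\rho(\varphi, \varphi_n) \le m[\mathcal{E}_n(\delta)] + \frac{\delta}{1 + \delta}\,m(\mathcal{E}).$$

Given $\varepsilon > 0$, we can fix $\delta > 0$ such that $m(\mathcal{E})\,\delta/(1 + \delta) \le \varepsilon/2$. Further, an $N$ exists such that $m[\mathcal{E}_n(\delta)] \le \varepsilon/2$ for $n > N$, so that $\rho(\varphi, \varphi_n) \le \varepsilon$ for $n > N$, i.e. $\rho(\varphi, \varphi_n) \to 0$.

Now let $\rho(\varphi, \varphi_n) \to 0$, and let us show that $\varphi_n(x) \to \varphi(x)$ in measure on $\mathcal{E}$. By what has been said regarding $\omega(t)$, we have $|\varphi(x) - \varphi_n(x)| / [1 + |\varphi(x) - \varphi_n(x)|] \ge \delta/(1 + \delta)$, if $x \in \mathcal{E}_n(\delta)$. Hence,

$$\rho(\varphi, \varphi_n) \ge \int_{\mathcal{E}_n(\delta)} \frac{|\varphi(x) - \varphi_n(x)|}{1 + |\varphi(x) - \varphi_n(x)|}\,dx \ge \frac{\delta}{1 + \delta}\,m[\mathcal{E}_n(\delta)],$$

where $\delta > 0$ is assumed fixed. By hypothesis, $\rho(\varphi, \varphi_n) \to 0$, and it follows from the last inequality that $m[\mathcal{E}_n(\delta)] \to 0$, which is what we needed to prove.

By utilizing the theorem of [44] and what has been proved, we can say that, if $\rho(\varphi, \varphi_n) \to 0$ in $S$, there exists a subsequence $\varphi_{n_k}(x)$ such that $\varphi_{n_k}(x) \to \varphi(x)$ almost everywhere on $\mathcal{E}$. The proof that $S$ is complete is essentially the same as the proof for $L_p$.

We could have used the Lebesgue-Stieltjes measure and integral in forming the function spaces $L_p$, $M$ and $S$.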


88. Examples of applications of the principle of compressed mappings.

1. We take the system of $n$ equations with $n$ unknowns:

$$\xi_i = \lambda \sum_{k=1}^{n} a_{ik}\,\xi_k + b_i \qquad (i = 1, 2, \ldots, n), \tag{21}$$

where $\lambda$ is a numerical parameter. We shall regard the right-hand sides as the operator $Ax$ from $R_n$ into $R_n$, applied to the element $x(\xi_1, \xi_2, \ldots, \xi_n)$ and acting in the whole of $R_n$. We have from Cauchy's inequality:

$$\rho(Ax, Ay) \le |\lambda| \left[\sum_{i,k=1}^{n} |a_{ik}|^2\right]^{1/2} \rho(x, y).$$

Thus the principle of compressed mappings will be applicable in $R_n$ if

$$|\lambda| < \left[\sum_{i,k=1}^{n} |a_{ik}|^2\right]^{-1/2}.$$
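A system of the form (21) can be solved by the successive approximations of [86] once the sufficient condition above has been checked. The following Python sketch is a minimal illustration under assumptions of our own: the function name `solve_by_iteration`, the tolerance, the zero initial element and the $2 \times 2$ coefficients are all hypothetical. The contraction constant is taken as $\alpha = |\lambda|\,[\sum |a_{ik}|^2]^{1/2}$, and the loop stops on the bound $\frac{\alpha}{1 - \alpha}\,\rho(x_{n-1}, x_n)$ for the distance to the solution.

```python
import math

def solve_by_iteration(a, b, lam, tol=1e-12, max_iter=10_000):
    """Successive approximations for xi_i = lam * sum_k a[i][k] * xi_k + b[i]."""
    n = len(b)
    frob = math.sqrt(sum(abs(a[i][k]) ** 2 for i in range(n) for k in range(n)))
    alpha = abs(lam) * frob            # constant in (7), from Cauchy's inequality
    if alpha >= 1.0:
        raise ValueError("sufficient condition |lam| * sqrt(sum |a_ik|^2) < 1 is not met")
    x = [0.0] * n                      # arbitrary initial element of R_n
    for _ in range(max_iter):
        x_new = [lam * sum(a[i][k] * x[k] for k in range(n)) + b[i] for i in range(n)]
        step = math.sqrt(sum((u - v) ** 2 for u, v in zip(x_new, x)))   # distance (9)
        x = x_new
        if alpha / (1.0 - alpha) * step <= tol:
            return x
    raise RuntimeError("iteration did not converge")

# Hypothetical data: sum of squares of the coefficients is 0.30, so alpha < 1 for lam = 1.
a = [[0.2, 0.1], [0.3, 0.4]]
b = [1.0, 2.0]
x = solve_by_iteration(a, b, lam=1.0)
print(x)
```

The condition is sufficient only: the iteration may converge for larger $|\lambda|$, but the sketch refuses to run in that case because the error bound would no longer be justified.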


2. We take the infinite system of equations

$$\xi_i = \lambda \sum_{k=1}^{\infty} a_{ik}\,\xi_k + b_i \qquad (i = 1, 2, \ldots), \tag{22}$$

where the sequence $(b_1, b_2, \ldots)$ is regarded as an element of $m$. If

$$\sup_i \sum_{k=1}^{\infty} |a_{ik}| = c$$

is a finite positive number, the right-hand sides of (22) yield an operator $A$ from $m$ into $m$, defined in the whole of $m$, and the principle of compressed mappings is applicable if $|\lambda|\,c < 1$. If $(b_1, b_2, \ldots)$ is an element of $l_2$ and

$$\sum_{i,k=1}^{\infty} |a_{ik}|^2 = d^2 < +\infty,$$

the right-hand sides of (22) yield an operator from $l_2$ into $l_2$, defined in the whole of $l_2$, and the principle of compressed mappings is applicable if $|\lambda|\,d < 1$. Notice that the solution is unique in these spaces, but that solutions may exist that do not belong to the spaces.
3. We take the integral equation (one-dimensional case)

$$\varphi(x) = \lambda \int_a^b K(x, t)\,\varphi(t)\,dt + f(x), \tag{23}$$

where $[a, b]$ is a finite interval and $K(x, t)$ is continuous in the square $Q$ ($a \le x \le b$; $a \le t \le b$). If $f(x)$ is continuous on $[a, b]$, the right-hand side of (23) is an operator from $C[a, b]$ into $C[a, b]$, defined on the whole of $C$, and the principle of compressed mappings is applicable to equation (23) if

$$|\lambda| \max_{a \le x \le b} \int_a^b |K(x, t)|\,dt < 1.$$

If $K(x, t) \in L_2$ on $Q$ and $f(x) \in L_2$ on $[a, b]$ (the interval may also be infinite), the right-hand side is an operator from $L_2[a, b]$ into $L_2[a, b]$, defined in the whole of $L_2[a, b]$, and the principle of compressed mappings is applicable to (23) if

$$|\lambda| \left[\int_a^b \!\! \int_a^b |K(x, t)|^2\,dx\,dt\right]^{1/2} < 1.$$

What has been said also holds for multi-dimensional integral equations.
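The successive approximations for (23) can be imitated numerically by replacing the integral with a quadrature sum. The Python sketch below is an illustration under our own assumptions, not a method given in the text: the integral is approximated by the midpoint rule, the function names `contraction_constant` and `successive_approximations` are hypothetical, and the kernel $K(x, t) = xt$ with $f(x) = x$, $\lambda = 1$ on $[0, 1]$ is chosen only because the exact solution, $\varphi(x) = \tfrac{3}{2}x$, is easy to verify by substitution.

```python
def contraction_constant(K, lam, a, b, n_nodes=200):
    """Midpoint-rule estimate of |lam| * max_x of the integral of |K(x, t)| dt over [a, b]."""
    h = (b - a) / n_nodes
    t = [a + (j + 0.5) * h for j in range(n_nodes)]
    return abs(lam) * max(h * sum(abs(K(x, s)) for s in t) for x in t)

def successive_approximations(K, f, lam, a, b, n_nodes=200, n_iter=40):
    """Iterate phi <- lam * integral K(x, t) phi(t) dt + f(x) on the midpoint nodes."""
    h = (b - a) / n_nodes
    t = [a + (j + 0.5) * h for j in range(n_nodes)]
    phi = [f(x) for x in t]                    # initial approximation phi = f
    for _ in range(n_iter):
        phi = [lam * h * sum(K(x, s) * p for s, p in zip(t, phi)) + f(x) for x in t]
    return t, phi

# Hypothetical test problem on [0, 1]; exact solution phi(x) = 1.5 * x.
K = lambda x, t: x * t
f = lambda x: x
alpha = contraction_constant(K, 1.0, 0.0, 1.0)
t, phi = successive_approximations(K, f, 1.0, 0.0, 1.0)
err = max(abs(p - 1.5 * x) for x, p in zip(t, phi))
print(alpha, err)
```

The printed `alpha` estimates the left-hand side of the sufficient condition in $C[a, b]$ (here close to $1/2$), and `err` measures the discrepancy from the exact solution at the nodes, which is due to the quadrature rather than to the iteration.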
4. We take the non-linear integral equation:

$$\varphi(x) = \lambda \int_a^b K[x, t, \varphi(t)]\,dt, \tag{24}$$

where $[a, b]$ is a finite interval, $K(x, t, z)$ is a continuous function of its arguments for $a \le x \le b$, $a \le t \le b$ and $|z| \le C$, where $C$ is a given positive number. Given any choice of function $\varphi(t)$, continuous for $a \le t \le b$ and satisfying the condition $|\varphi(t)| \le C$, $K[x, t, \varphi(t)]$ is a continuous function of $(x, t)$ in the above-mentioned square $Q$. Let $|K(x, t, z)| \le d$ for $(x, t) \in Q$ and $|z| \le C$. If $|\lambda|\,d\,(b - a) \le C$, the right-hand side of (24) is the operator $A\varphi$ into $C[a, b]$, for which $D(A)$ is the sphere $\rho(0, \varphi) \le C$, where 0 is the continuous function equal to zero in $[a, b]$, and $R(A)$ belongs to the same sphere. Notice that we can write $\rho(0, \varphi) \le C$ in the form $|\varphi(x)| \le C$. Suppose, further, that the kernel $K(x, t, z)$ satisfies a Lipschitz condition with respect to the third argument, i.e.

$$|K(x, t, z_1) - K(x, t, z_2)| \le N\,|z_1 - z_2|$$

if $(x, t) \in Q$, whilst $|z_1|$ and $|z_2| \le C$. Now,

$$\rho(A\varphi, A\psi) \le |\lambda|\,N\,(b - a)\,\rho(\varphi, \psi),$$

so that, when the conditions

$$|\lambda|\,d\,(b - a) \le C \quad \text{and} \quad |\lambda|\,N\,(b - a) < 1$$

are satisfied, the principle of compressed mappings is applicable to equation (24) in the above-mentioned sphere. This equation has a unique solution in the sphere, which can be obtained by the method of successive approximations with any choice of initial approximation $\varphi_0(x)$ in the sphere. This method gives a uniform convergence of the approximations to the solution in the interval $[a, b]$.
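The two conditions of this example can be tried out on a concrete kernel. In the Python sketch below everything specific is an assumption of ours: the kernel $K(x, t, z) = \cos(xt + z)$ on $[0, 1]$, for which $d = 1$ and $N = 1$, the values $\lambda = 0.5$ and $C = 0.5$ (so that $|\lambda|\,d\,(b - a) = 0.5 \le C$ and $|\lambda|\,N\,(b - a) = 0.5 < 1$), the midpoint-rule discretization, and the function name `picard`. The sketch only illustrates the method of successive approximations applied to (24); it makes no claim about the accuracy of the quadrature.

```python
import math

def picard(K, lam, a, b, n_nodes=100, n_iter=80):
    """Successive approximations for phi(x) = lam * integral of K[x, t, phi(t)] dt.

    Returns the nodes, the final approximation at the nodes and the size of
    the last correction max |phi_new - phi_old|.
    """
    h = (b - a) / n_nodes
    t = [a + (j + 0.5) * h for j in range(n_nodes)]
    phi = [0.0] * n_nodes                      # initial approximation inside the sphere
    step = float("inf")
    for _ in range(n_iter):
        new = [lam * h * sum(K(x, s, p) for s, p in zip(t, phi)) for x in t]
        step = max(abs(u - v) for u, v in zip(new, phi))
        phi = new
    return t, phi, step

# Hypothetical kernel with |K| <= 1 and Lipschitz constant 1 in the third argument.
K = lambda x, t, z: math.cos(x * t + z)
t, phi, step = picard(K, 0.5, 0.0, 1.0)
print(max(abs(p) for p in phi), step)
```

Every iterate stays in the sphere $|\varphi(x)| \le C$, as the first condition requires, and the last correction is negligible after the chosen number of steps because each step reduces it by a factor of at most $|\lambda|\,N\,(b - a)$.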

5. Let $D$ be a domain of three-dimensional space, bounded by the Lyapunov surface $S$. Let us take the boundary problem for the elliptic equation:

$$\Delta u - \lambda f(x, y, z, u) = 0 \quad \text{inside } D, \tag{25}$$

$$u\,|_S = 0, \tag{26}$$

where $\Delta$ is the Laplace operator. We assume that $f(x, y, z, u)$ is continuous in the four-dimensional closed domain of space $(x, y, z, u)$ corresponding to the variation of $(x, y, z)$ in the closed domain $D$ with $|u| \le c$, and has continuous derivatives with respect to its arguments inside this domain, the derivatives being continuous as far as the boundary. We suppose further that $|f(x, y, z, u)| \le d$ for $(x, y, z) \in D$ and $|u| \le c$ and

$$|f(x, y, z, u_1) - f(x, y, z, u_2)| \le N\,|u_1 - u_2|$$

with the conditions indicated ($|u_1|$ and $|u_2| \le c$). Let $G(x, y, z; \xi, \eta, \zeta)$ be Green's function for the Laplace operator for the domain $D$ with boundary condition (26) [IV; 220]. We introduce the points $P(x, y, z)$ and $Q(\xi, \eta, \zeta)$ of $D$. The solution of problem (25) and (26) is equivalent to the solution of the integral equation

$$u(P) = \lambda \int_D G(P; Q)\,f[Q; u(Q)]\,d\tau_Q \tag{27}$$

in the space $C(D)$ of functions $u(Q)$ continuous in $D$ [IV; 224]. We know that $G(P; Q) > 0$ in $D$ [IV; 221], and that there exists a finite

$$\max_{P \in D} \int_D G(P; Q)\,d\tau_Q = G_0.$$

If $|\lambda|\,G_0\,d \le c$, the right-hand side of (27) is the operator into $C(D)$ for which $D(A)$ is the sphere $\rho(0, u) \le c$ in $C(D)$ (i.e. $|u(P)| \le c$ in $D$) and $R(A)$ belongs to this sphere. The principle of compressed mappings is applicable to the equation (27) if $|\lambda|\,N\,G_0 < 1$. Thus, when the conditions

$$|\lambda|\,G_0\,d \le c \quad \text{and} \quad |\lambda|\,N\,G_0 < 1$$

are satisfied, problem (25) and (26) has a unique solution in the sphere $|u(P)| \le c$. It can be obtained by the method of successive approximations, applied to (27) with any choice of initial approximation from the sphere, and the approximations are uniformly convergent to the solution in $D$.


89. Compactness. The idea of compactness has already been intro 
duced for a particular case [IV; 36]. We now discuss this concept for 
a general metric space X. A set U of elements of X is said to be com- 
pact in space X or simply compact, if any sequence of elements £n 
contains a convergent subsequence. If, in addition, U is closed, it is 
said to be mutually compact. 

It may easily be seen that a necessary condition for the set $U$ to be
compact is that it be bounded. For, if $U$ is unbounded, there exists a
sequence $x_n$ of $U$ such that $\rho(a, x_n) \to +\infty$, where $a$ is any fixed
element. It is impossible to extract a convergent subsequence from
this sequence, since every convergent sequence is bounded. The condition
that $U$ be bounded is also sufficient for it to be compact in $R_n$
[IV; 15]. This is not true in the general case of a metric space, and we
next establish the necessary and sufficient condition for compactness.

We must first introduce a new concept. We shall say that a set $U$
has a finite $\varepsilon$ net, where $\varepsilon$ is a given positive number, if
there exists a finite set $x_k$ ($k = 1, 2, \ldots, l$) of elements of $X$ such
that, given any $x$ of $U$, an element $x_k$ of our set can be found such that
$\rho(x, x_k) \le \varepsilon$. Notice that the elements $x_k$ may not belong to $U$.

THEOREM. The necessary and sufficient condition for a set $U$ of elements
of a complete metric space to be compact is that, given any $\varepsilon > 0$, it has
a finite $\varepsilon$ net.




Necessity. Suppose that, for some $\varepsilon_0 > 0$, there is no finite
$\varepsilon_0$ net for $U$, and let us show that $U$ is not compact. We take some
element $x_1 \in U$. It can be asserted that there is an element $x_2 \in U$ such
that $\rho(x_1, x_2) > \varepsilon_0$. For we should otherwise have
$\rho(x_1, x) \le \varepsilon_0$ for any $x \in U$, and the element $x_1$ would give
an $\varepsilon_0$ net for $U$. Further, an element $x_3$ exists such that
$\rho(x_i, x_3) > \varepsilon_0$ ($i = 1, 2$), since otherwise elements $x_1$ and
$x_2$ would give an $\varepsilon_0$ net for $U$, and so on.

We thus obtain an infinite sequence of elements $x_n$ of $U$ such that
$\rho(x_p, x_q) > \varepsilon_0$ for all $p \ne q$. Given any subsequence $x_{n_k}$
($k = 1, 2, \ldots$) we shall also have $\rho(x_{n_k}, x_{n_l}) > \varepsilon_0$ for
$n_k \ne n_l$, so that no subsequence of $x_n$ can be convergent, i.e. $U$ is not
compact.
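
The selection used in the necessity argument (keep adding an element that is farther than $\varepsilon$ from all those already chosen) can be carried out on a finite set of points, where it must stop and then yields a finite $\varepsilon$ net. The following sketch assumes points of the plane with the Euclidean distance; the function name `greedy_net` and the sample grid are hypothetical.

```python
import math


def greedy_net(points, eps):
    # Keep a point as a new centre only if it lies farther than eps from
    # every centre chosen so far, as in the necessity argument.
    centres = []
    for p in points:
        if all(math.dist(p, c) > eps for c in centres):
            centres.append(p)
    return centres


# Hypothetical sample: an 11 x 11 grid of points in the unit square.
points = [(i / 10, j / 10) for i in range(11) for j in range(11)]
net = greedy_net(points, 0.25)
```

Every point that is not kept lies within `eps` of some centre, so the centres form an `eps` net, while the centres themselves are pairwise more than `eps` apart.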

Sufficiency. Suppose that $U$ has a finite $\varepsilon$ net for any
$\varepsilon > 0$, and let $x_n$ be a sequence of elements of $U$.

We have to show that a convergent subsequence can be extracted
from it. If the elements $x_n$ coincide with the same element $y$ for an
infinite number of values of $n$, then $y, y, y, \ldots$ is a convergent
subsequence. Suppose that we do not have this situation. Now, if we leave in the
sequence $x_n$ only one element of each group of equal elements (say the
one with the least index), we get a sequence of different elements.
It can be assumed that the original sequence already has this property.
We fix some positive number $\varepsilon$. Since $U$ has a finite $\varepsilon/2$
net, there exists a finite number of closed spheres of radius $\varepsilon/2$ such
that all the elements of $U$, and hence all the $x_n$, belong to these spheres. An
infinite set of $x_n$ belongs to at least one of them.

Let us denote one of these spheres by $S_1(\varepsilon/2)$. Further, there exists
a finite number of spheres of radius $\varepsilon/2^2$, to which all the $x_n$ of
$S_1(\varepsilon/2)$ belong. Of these, we take the sphere $S_2(\varepsilon/2^2)$,
which contains an infinite set of the elements in question. Similarly, there
exists a sphere $S_3(\varepsilon/2^3)$ of radius $\varepsilon/2^3$, containing an
infinite set of elements $x_n$ belonging simultaneously to $S_1(\varepsilon/2)$ and
$S_2(\varepsilon/2^2)$. On continuing in this way, we get an infinite sequence of
closed spheres $S_k(\varepsilon/2^k)$ such that the radius of $S_k(\varepsilon/2^k)$
is $\varepsilon/2^k$, and $S_k(\varepsilon/2^k)$ contains an infinite set of elements
$x_n$, belonging simultaneously to all the spheres $S_m(\varepsilon/2^m)$, where
$m < k$. We take an element $x_{n_k}$ from each of the spheres
$S_k(\varepsilon/2^k)$, where it can be assumed that $n_l > n_k$ for $l > k$. We thus
obtain an infinite subsequence $x_{n_k}$ of the sequence $x_n$. On observing that,
by the triangle axiom, we have $\rho(x, y) \le 2r$ for any two elements $x$ and $y$
belonging to the same sphere of radius $r$, we can say that

$$\rho(x_{n_p}, x_{n_q}) \le \frac{\varepsilon}{2^{p-1}} \quad \text{for } n_q > n_p.$$

Hence it follows, since the space is complete, that $x_{n_k}$ is a convergent
sequence. The theorem is proved.

Note 1. It is sufficient for compactness that there exists merely
a compact, and not a finite, $\varepsilon$ net for any $\varepsilon > 0$. This means
that, given any $\varepsilon > 0$, there exist spheres of radius $\varepsilon$,
containing all the elements of $U$, the centres of which form a compact set. Let
$U_1$ denote this set of centres. By the theorem (necessity), there exists for
$U_1$ a finite $\varepsilon$ net, and it follows at once from the triangle axiom that
this net will be a finite $2\varepsilon$ net for $U$, whence it follows, in view of
the theorem (sufficiency) and the fact that $\varepsilon$ is arbitrary, that $U$ is a
compact set.

Note 2. Notice that $U$ can coincide with $X$, so that we can speak
of the compactness of the whole of $X$. It is easily shown, from the fact
that distance is continuous, that every compact space is complete.
Thus every set $U$ of elements of $X$ that is compact in itself is a complete
metric space.


90. Compactness in C. Let $C$ be the space of functions continuous
in the finite interval $[a, b]$, and $U$ some set of elements of $C$. We have
seen that the sufficient conditions for $U$ to be compact are that the
functions of $U$ be bounded and equicontinuous [IV; 16]. Let us show
that these conditions are necessary. Let $U$ be compact. By our theorem,
given any $\varepsilon > 0$, there exists a finite number of functions
$\varphi_1(t), \varphi_2(t), \ldots, \varphi_p(t)$ of $C$ such that, for any function
$\varphi(t)$ of $U$, we have $|\varphi(t) - \varphi_s(t)| \le \varepsilon/3$ for
$a \le t \le b$, where $\varphi_s(t)$ is one of the above-mentioned functions. Since
there is a finite number of such functions, there exists for all of them a
positive $\eta$ such that

$$|\varphi_k(t + h) - \varphi_k(t)| \le \frac{\varepsilon}{3} \quad \text{for } |h| \le \eta \quad (k = 1, 2, \ldots, p)$$

($t$ and $t + h \in [a, b]$), where $\eta$ depends only on $\varepsilon$.
We get from this:

$$|\varphi(t + h) - \varphi(t)| \le |\varphi(t + h) - \varphi_s(t + h)| + |\varphi_s(t + h) - \varphi_s(t)| + |\varphi_s(t) - \varphi(t)|.$$

When $|h| \le \eta$, all the terms on the right-hand side are $\le \varepsilon/3$, so
that $|\varphi(t + h) - \varphi(t)| \le \varepsilon$ for $|h| \le \eta$, and the
equicontinuity of the functions of $U$ is proved. The fact that they are bounded
is an immediate consequence of the fact that boundedness is a necessary
condition for $U$ to be compact [89]. This criterion for compactness is
proved in precisely the same way for functions of several variables,
defined in a finite closed domain of $R_n$. If the functions are defined
on a bounded closed set, the proof is the same in essence.
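
Equicontinuity can be examined numerically by measuring the largest value of $|\varphi(t + h) - \varphi(t)|$ over a whole family. The sketch below is an illustration only: the two families (shifted sines and sines of increasing frequency on $[0, 1]$), the sampling of $t$ and the name `modulus` are assumptions and do not come from the text.

```python
import math


def modulus(family, h, a=0.0, b=1.0, samples=1000):
    # Largest |phi(t + h) - phi(t)| over the family and over sampled t,
    # with both t and t + h kept inside [a, b] (h is assumed positive).
    worst = 0.0
    for phi in family:
        for i in range(samples + 1):
            t = a + (b - h - a) * i / samples
            worst = max(worst, abs(phi(t + h) - phi(t)))
    return worst


# Assumed families: the first is bounded and equicontinuous, the second is
# bounded but not equicontinuous.
shifted = [lambda t, c=c: math.sin(t + c) for c in range(50)]
oscillating = [lambda t, n=n: math.sin(n * t) for n in range(1, 201)]

small = modulus(shifted, 0.01)
large = modulus(oscillating, 0.01)
```

For the shifted sines the value never exceeds $h$, whatever the shift, whereas for the family $\sin nt$ a single $\eta$ cannot serve all $n$: the increment stays of order 1 even for $h = 0.01$.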


91. Compactness in $L_p$. Let us take the space $L_p$ of functions
$\varphi(x, y)$ on some measurable bounded set $\mathcal{E}$ of the $(x, y)$ plane.
We shall assume in future that all these functions are continued by zero outside
$\mathcal{E}$ and that integration is carried out over the whole plane. The
integrals in fact reduce to integrals over a bounded measurable set.

THEOREM. The necessary and sufficient condition for a set $U$ of elements
of $L_p$ to be compact is that all the functions $\varphi(x, y)$ of $U$ satisfy the
following two conditions:

1. There exists a $C > 0$ such that

$$\|\varphi\| = \left[\int_{\mathcal{E}} |\varphi(x, y)|^p\, dx\, dy\right]^{1/p} \le C \quad \text{(boundedness)}, \qquad (28)$$

where $\|\varphi\|$ is the notation for the left-hand side of (28). This quantity
is called the norm of $\varphi(x, y)$ in $L_p$ on $\mathcal{E}$ [62].

2. Given any $\varepsilon > 0$, there exists an $\eta > 0$, the same for all
$\varphi(x, y)$ of $U$, such that

$$\|\varphi(x + h, y + k) - \varphi(x, y)\| = \left[\int |\varphi(x + h, y + k) - \varphi(x, y)|^p\, dx\, dy\right]^{1/p} \le \varepsilon \quad \text{for } \sqrt{h^2 + k^2} \le \eta. \qquad (29)$$

We know that, given $\varepsilon > 0$, there exists for every fixed function of
$L_p$ an $\eta > 0$ such that (29) holds (continuity in the mean) [70]. This
is also obviously true for a finite number of functions $\varphi_k(x, y)$
($k = 1, 2, \ldots, n$) of $L_p$. It is sufficient to take the least of the $\eta$
corresponding to the $\varphi_k(x, y)$. Property (29), which must hold for all the
$\varphi(x, y)$ of $U$, may be termed equicontinuity in the mean of all functions
of $U$. Notice also that $\varphi(x + h, y + k)$ is a measurable function and that
$\varphi(x + h, y + k) - \varphi(x, y) = 0$ outside some bounded measurable set.

The necessity of the conditions is proved in the same way as for $C$.
The only change is in taking $\|\varphi - \psi\| = \rho(\varphi, \psi)$ in $L_p$ in
place of the absolute value of $\varphi - \psi$. For, boundedness (28), which can be
written in the form $\rho(0, \varphi) \le C$, as we know, is necessary for
compactness. Further, the compactness implies the existence of a finite
$\varepsilon/3$ net, i.e. a finite number of functions $\psi_j(x, y)$
($j = 1, 2, \ldots, n$) of $L_p$ such that, given any $\varphi(x, y) \in U$, there
exists a $\psi_s(x, y)$ such that $\|\varphi - \psi_s\| \le \varepsilon/3$. There
exists for all the $\psi_j(x, y)$ an $\eta > 0$ such that

$$\|\psi_j(x + h, y + k) - \psi_j(x, y)\| \le \frac{\varepsilon}{3} \quad \text{for } \sqrt{h^2 + k^2} \le \eta. \qquad (30)$$

We can also write:

$$|\varphi(x + h, y + k) - \varphi(x, y)| \le |\varphi(x + h, y + k) - \psi_s(x + h, y + k)| + |\psi_s(x + h, y + k) - \psi_s(x, y)| + |\psi_s(x, y) - \varphi(x, y)|$$

and, in view of (30) and $\|\varphi - \psi_s\| \le \varepsilon/3$, we get (29) on
applying Minkovskii's inequality with $p > 1$. When $p = 1$, (29) is obtained at
once.

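Let us now prove the sufficiency of conditions (28) and (29). Let
$\varphi_\rho(x, y)$ denote the mean function for $\varphi(x, y)$ [71]. We recall
(178) of [71]. Given any $p \ge 1$, it takes the form

$$\|\varphi - \varphi_\rho\|^p \le \frac{C_1}{\rho^2} \int_{u^2 + v^2 \le \rho^2} \left[\int |\varphi(\xi, \eta) - \varphi(\xi + u, \eta + v)|^p\, d\xi\, d\eta\right] du\, dv, \qquad (31)$$

where $C_1$ is a constant. By condition (29), given any $\varepsilon > 0$, there
exists an $\eta > 0$, the same for all $\varphi(x, y) \in U$, such that

$$\int |\varphi(\xi, \eta) - \varphi(\xi + u, \eta + v)|^p\, d\xi\, d\eta \le \frac{\varepsilon^p}{\pi C_1} \quad \text{for } u^2 + v^2 \le \eta^2,$$

and inequality (31) gives us, for any $\varphi(x, y) \in U$:

$$\|\varphi - \varphi_\rho\| \le \varepsilon \quad \text{for } \rho \le \eta. \qquad (32)$$

The norm on the left is taken over the entire plane $\mathcal{E}_\infty$ (actually,
over a bounded set). All the more, $\|\varphi - \varphi_\rho\|_{\mathcal{E}} \le \varepsilon$
for $\rho \le \eta$. Having fixed $\rho \le \eta$, we can say that the functions
$\varphi_\rho(x, y)$ form an $\varepsilon$ net for the set $U$ of functions
$\varphi(x, y)$. Let $\Delta$ ($a \le x \le b$; $c \le y \le d$) be an
interval containing $\mathcal{E}$. By condition (28) and Theorem 3 of [71], we
can say that the set of functions $\varphi_\rho(x, y)$ is compact in $C$ on
$\Delta$, and all the more, compact in $L_p$ on $\mathcal{E}$. Hence the functions
$\varphi_\rho(x, y)$ form a compact $\varepsilon$ net for $U$, and we can say, since
$\varepsilon$ is arbitrary, that the set $U$ is compact. The sufficiency of (28) and
(29) is proved.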

We now take the case when $\mathcal{E}$ is the entire plane $\mathcal{E}_\infty$.
The above proof now loses its force, since the set of functions
$\varphi_\rho(x, y)$ may be non-compact. The necessity of conditions (28) and (29)
for compactness is proved as above. But these conditions are not sufficient. We
have to add a further condition, viz: given any $\varepsilon > 0$, there exists a
positive $N$, the same for all $\varphi(x, y)$ of $U$, such that

$$\int_{\mathcal{E}_\infty - \Delta_N} |\varphi(x, y)|^p\, dx\, dy \le \varepsilon^p, \qquad (33)$$

where $\Delta_m$ is the interval ($-m \le x \le m$; $-m \le y \le m$). Notice that,
if condition (33) is fulfilled for a certain $N$, it is preserved when $N$
increases.

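Let us prove the necessity of condition (33). If $U$ is a compact set,
it has a finite $\varepsilon/2$ net of functions $\psi_k(x, y)$
($k = 1, 2, \ldots, n$). Condition (33) is satisfied for each of these functions,
and, since there is a finite number of them, given any $\varepsilon > 0$, there
exists an $N$ such that

$$\int_{\mathcal{E}_\infty - \Delta_N} |\psi_k(x, y)|^p\, dx\, dy \le \frac{\varepsilon^p}{2^p} \quad (k = 1, 2, \ldots, n). \qquad (34)$$

We take some $\varphi(x, y) \in U$. There exists a $\psi_s(x, y)$ such that
$\|\varphi - \psi_s\|_{\mathcal{E}_\infty} \le \varepsilon/2$. It follows from

$$\|\varphi\|_{\mathcal{E}_\infty - \Delta_N} \le \|\varphi - \psi_s\|_{\mathcal{E}_\infty - \Delta_N} + \|\psi_s\|_{\mathcal{E}_\infty - \Delta_N},$$

together with (34) and
$\|\varphi - \psi_s\|_{\mathcal{E}_\infty - \Delta_N} \le \|\varphi - \psi_s\|_{\mathcal{E}_\infty} \le \varepsilon/2$,
that

$$\|\varphi\|_{\mathcal{E}_\infty - \Delta_N} = \left[\int_{\mathcal{E}_\infty - \Delta_N} |\varphi(x, y)|^p\, dx\, dy\right]^{1/p} \le \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon,$$

which proves (33) for any $\varphi(x, y) \in U$.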

Let us now prove the sufficiency of (28), (29) and (33). Let these
conditions be satisfied for functions $\varphi(x, y) \in U$, and let
$\varphi_1(x, y), \varphi_2(x, y), \ldots$ be any sequence of functions of $U$. We
have to show that we can extract from it a subsequence which is convergent in
$L_p$ on $\mathcal{E}_\infty$. It follows from (28) and (29) that a subsequence
$\varphi_1^{(1)}(x, y), \varphi_2^{(1)}(x, y), \ldots$ can be found which is
convergent in $L_p$ on $\Delta_1$. We can extract from this latter a new
subsequence $\varphi_1^{(2)}(x, y), \varphi_2^{(2)}(x, y), \ldots$, which is
convergent in $L_p$ on $\Delta_2$, and so on. We form the subsequence

$$\varphi_1^{(1)}(x, y),\ \varphi_2^{(2)}(x, y),\ \varphi_3^{(3)}(x, y),\ \ldots \qquad (35)$$

which is a subsequence of the original sequence $\varphi_n(x, y)$. If $m$ is any
positive integer, all the terms of sequence (35), as from $\varphi_m^{(m)}(x, y)$,
belong to the sequence $\varphi_1^{(m)}(x, y), \varphi_2^{(m)}(x, y), \ldots$, which
is convergent in $L_p$ on $\Delta_m$, i.e. sequence (35) is convergent in $L_p$ on
any finite interval $\Delta_m$ ($m = 1, 2, \ldots$). We show that it is convergent
in $L_p$ on $\mathcal{E}_\infty$ also.
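We consider the integral:

$$\|\varphi_q^{(q)} - \varphi_r^{(r)}\|^p_{\mathcal{E}_\infty} = \int_{\Delta_m} |\varphi_q^{(q)}(x, y) - \varphi_r^{(r)}(x, y)|^p\, dx\, dy + \int_{\mathcal{E}_\infty - \Delta_m} |\varphi_q^{(q)}(x, y) - \varphi_r^{(r)}(x, y)|^p\, dx\, dy.$$

On using the obvious inequality $|x + y|^p \le 2^p |x|^p + 2^p |y|^p$, we get

$$\|\varphi_q^{(q)} - \varphi_r^{(r)}\|^p_{\mathcal{E}_\infty} \le \int_{\Delta_m} |\varphi_q^{(q)}(x, y) - \varphi_r^{(r)}(x, y)|^p\, dx\, dy + 2^p \int_{\mathcal{E}_\infty - \Delta_m} |\varphi_q^{(q)}(x, y)|^p\, dx\, dy + 2^p \int_{\mathcal{E}_\infty - \Delta_m} |\varphi_r^{(r)}(x, y)|^p\, dx\, dy.$$

By condition (33), given any $\varepsilon > 0$, there exists an $m$ such that the
sum of the terms on the right-hand side, apart from the first, is less
than $\varepsilon^p/2$. Having fixed such an $m$, we get

$$\|\varphi_q^{(q)} - \varphi_r^{(r)}\|^p_{\mathcal{E}_\infty} \le \int_{\Delta_m} |\varphi_q^{(q)}(x, y) - \varphi_r^{(r)}(x, y)|^p\, dx\, dy + \frac{\varepsilon^p}{2}.$$

But it follows from the convergence of sequence (35) in $L_p$ on $\Delta_m$
that the integral on the right is not greater than $\varepsilon^p/2$ for all
sufficiently large $q$ and $r$, and hence there exists an $M$ such that

$$\|\varphi_q^{(q)} - \varphi_r^{(r)}\|_{\mathcal{E}_\infty} \le \varepsilon \quad \text{for } q \text{ and } r > M,$$

i.e. sequence (35) is mutually convergent in $L_p(\mathcal{E}_\infty)$, and, since
$L_p(\mathcal{E}_\infty)$ is complete, this sequence has a limit in
$L_p(\mathcal{E}_\infty)$. The sufficiency of conditions (28), (29) and (33) is
proved. It is easily shown that the last condition is not a consequence of the
first two.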

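We have considered the case of a plane for definiteness. Everything
said obviously also holds in any space $R_n$. On writing $x(x_1, x_2, \ldots, x_n)$
for a point of this space and introducing the notation
$dx = dx_1\, dx_2 \ldots dx_n$, we can write (28) and (29) as

$$\left[\int_{\mathcal{E}} |\varphi(x)|^p\, dx\right]^{1/p} \le C, \qquad (36)$$

$$\left[\int_{\mathcal{E}} |\varphi(x + y) - \varphi(x)|^p\, dx\right]^{1/p} \le \varepsilon \quad \text{for } |y| \le \eta, \qquad (37)$$

where $y$ has components $(y_1, y_2, \ldots, y_n)$ and
$|y| = \sqrt{y_1^2 + y_2^2 + \ldots + y_n^2}$.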




92. Compactness in $l_p$. Let us prove the following theorem:

THEOREM. The necessary and sufficient condition for a set $U$ of elements
of $l_p$ ($p \ge 1$) to be compact is that all the elements
$x(\xi_1, \xi_2, \ldots)$ of $U$ satisfy the following two conditions:

1. There exists a number $C > 0$ such that

$$\|x\| = \left(\sum_{l=1}^{\infty} |\xi_l|^p\right)^{1/p} \le C \quad \text{(boundedness)}. \qquad (38)$$

2. Given any $\varepsilon > 0$, there exists a positive integer $n_\varepsilon$, the
same for all $x \in U$, such that

$$\left(\sum_{l=n_\varepsilon}^{\infty} |\xi_l|^p\right)^{1/p} \le \varepsilon. \qquad (39)$$


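Necessity. As we know, boundedness (38) is a necessary condition
for compactness. Further, the compactness of $U$ implies the existence
of a finite number of elements $x_k$ ($k = 1, 2, \ldots, m$) of $l_p$ such that we
have for any $x \in U$: $\rho(x, x_s) = \|x - x_s\| \le \varepsilon/2$, where $x_s$
is one of the elements $x_k$. There exists for the elements
$x_k(\xi_1^{(k)}, \xi_2^{(k)}, \ldots)$, since the number of them is finite, a
positive integer $n_\varepsilon$ such that

$$\left(\sum_{l=n_\varepsilon}^{\infty} |\xi_l^{(k)}|^p\right)^{1/p} \le \frac{\varepsilon}{2} \quad (k = 1, 2, \ldots, m). \qquad (40)$$

But it follows from $\|x - x_s\| \le \varepsilon/2$ that

$$\left(\sum_{l=n_\varepsilon}^{\infty} |\xi_l - \xi_l^{(s)}|^p\right)^{1/p} \le \left(\sum_{l=1}^{\infty} |\xi_l - \xi_l^{(s)}|^p\right)^{1/p} \le \frac{\varepsilon}{2},$$

and we obtain, on applying Minkovskii's inequality for sums ($p > 1$):

$$\left(\sum_{l=n_\varepsilon}^{\infty} |\xi_l|^p\right)^{1/p} \le \left(\sum_{l=n_\varepsilon}^{\infty} |\xi_l - \xi_l^{(s)}|^p\right)^{1/p} + \left(\sum_{l=n_\varepsilon}^{\infty} |\xi_l^{(s)}|^p\right)^{1/p} \le \varepsilon,$$

which proves that condition (39) is necessary. The proof makes no
use of Minkovskii's inequality when $p = 1$.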

Sufficiency. Suppose that the elements of $U$ satisfy conditions
(38) and (39); let us prove that $U$ is compact. Given $\varepsilon > 0$, we
associate with each element $(\xi_1, \xi_2, \ldots)$ of $U$ a cut-off element
$(\xi_1, \xi_2, \ldots, \xi_{n_\varepsilon - 1}, 0, 0, \ldots)$, and write
$U_\varepsilon$ for the set of these cut-off elements. It follows from (39) that
there exists for any $x \in U$ an element $y \in U_\varepsilon$ such that
$\|x - y\| \le \varepsilon$, i.e. the set $U_\varepsilon$ is an $\varepsilon$ net for
$U$. It remains to show that $U_\varepsilon$ is a compact set [89]. The proof is
analogous to the proof that every set bounded in $R_n$ is compact.

By (38), we have $|\xi_s| \le C$ for any component of the elements of
$U_\varepsilon$. We can extract from any sequence of elements of $U_\varepsilon$ a
subsequence for which the first $(n_\varepsilon - 1)$ components have a finite
limit. The remaining components of these elements are zero, whence it follows that
the subsequence in question is convergent in $l_p$ to an element for which
all the components $\xi_s$ are zero for $s \ge n_\varepsilon$. The compactness of
$U_\varepsilon$ is therefore proved. It follows from the present theorem that the
sphere $\|x\| \le r$ in $l_p$ is non-compact.
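
The failure of condition (39) on a sphere can be seen on the elements with a single component equal to unity. The sketch below is an illustration under assumptions: sequences are truncated to a finite list (the omitted components being zero), indices start at 1 as in the text, and the names `lp_norm`, `tail_norm` and `unit_vector` are hypothetical.

```python
def lp_norm(x, p):
    # Norm (38) of a sequence given by its first len(x) components.
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def tail_norm(x, n, p):
    # Left-hand side of (39): the norm of the components with index >= n.
    return lp_norm(x[n - 1:], p)


def unit_vector(k, length):
    # Element whose k-th component is 1 and all others 0 (truncated list).
    return [1.0 if i == k else 0.0 for i in range(1, length + 1)]


p = 2
length = 20
# For every n the unit vector e_n has tail norm 1 from index n onwards.
tails = [tail_norm(unit_vector(n, length), n, p) for n in range(1, length + 1)]
difference = [a - b for a, b in zip(unit_vector(3, length), unit_vector(7, length))]
gap = lp_norm(difference, p)
```

All these elements lie on the sphere $\|x\| = 1$, yet no single $n_\varepsilon$ makes the tail (39) small for all of them, and any two of them are at distance $2^{1/p}$ from each other, so that no subsequence can converge.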




93. Functionals on mutually compact sets. Let the functional $l(x)$,
which takes real values, be defined on the mutually compact set $U$ of
the metric space $X$. It is said to be continuous if $x_n \Rightarrow x_0$ implies
$l(x_n) \to l(x_0)$.

A theorem holds for such functionals, analogous to the theorem on
continuous functions on bounded closed sets of space $R_n$.

THEOREM 1. If $U$ is a mutually compact set of space $X$ and $l(x)$ is a
real continuous functional on $U$, it is bounded and attains its strict lower
and upper bounds on $U$.

We shall only prove that $l(x)$ is bounded from below and attains its
strict lower bound. The boundedness is proved by reductio ad absurdum.
If the set of values of $l(x)$ were not bounded from below, there
would exist a sequence of elements $x_n$ of $U$ such that $l(x_n) \to -\infty$.
Since $U$ is compact, we can extract from the $x_n$ a convergent subsequence
$x_{n_k} \Rightarrow x_0$, and in view of the mutual compactness, $x_0 \in U$. Now,
since $l(x)$ is continuous, we have $l(x_{n_k}) \to l(x_0)$, which contradicts
$l(x_{n_k}) \to -\infty$, since $l(x_0)$ is a finite number.

Let $a$ be the strict lower bound of the set of values of $l(x)$ on $U$.
There now exists a sequence of elements $x_n \in U$ such that
$a \le l(x_n) \le a + 1/n$. As above, we can assume that
$x_{n_k} \Rightarrow x_0$, where $x_0 \in U$, and consequently
$l(x_{n_k}) \to l(x_0)$. But it follows from $a \le l(x_{n_k}) \le a + 1/n_k$ that
$l(x_{n_k}) \to a$, whence $l(x_0) = a$, which is what we wished to prove.

We introduced above the concepts of lower and upper limits of real
number sequences $a_n$ ($n = 1, 2, \ldots$). Let us bring in the notation for
these:

$$S = \underline{\lim}\, a_n; \qquad T = \overline{\lim}\, a_n.$$

These limits may be equal to $+\infty$ or $-\infty$.

If a sequence $a_n$ has a limit, $S$ and $T$ coincide with this limit. In
addition, it follows from the definition of $S$ and $T$ that no subsequence
$a_{n_k}$ of the sequence $a_n$ can have a limit which is less than $S$ or greater
than $T$, but there is at least one subsequence which has the limit $S$,
and one with the limit $T$. The functional $l(x)$ is said to be semi-continuous
from below on $U$ if $x_n \Rightarrow x_0$ implies that
$\underline{\lim}\, l(x_n) \ge l(x_0)$, and is semi-continuous from above if
$x_n \Rightarrow x_0$ implies $\overline{\lim}\, l(x_n) \le l(x_0)$.

Let us prove a generalization of Theorem 1, important in applications. 

THEOREM 2. A functional $l(x)$, defined on a mutually compact set $U$ of
a metric space and semi-continuous from below (above), is bounded from
below (above) and attains on $U$ its strict lower (upper) bound.

We take a functional semi-continuous from below, and prove as in Theorem 1
that it is bounded from below. The assumption that $l(x_n) \to -\infty$
leads to a subsequence $x_{n_k} \Rightarrow x_0$, where $x_0 \in U$ and
$l(x_{n_k}) \to -\infty$. But, in view of the semi-continuity from below,
$\underline{\lim}\, l(x_{n_k}) \ge l(x_0)$, where $l(x_0)$ is a finite number, which
contradicts $l(x_{n_k}) \to -\infty$.

Let $a$ be the strict lower bound of the set of values of $l(x)$ on $U$. As
in the proof of Theorem 1, we get a subsequence $x_{n_k} \Rightarrow x_0$ and
$a \le l(x_{n_k}) \le a + 1/n_k$. It follows from the first that
$\underline{\lim}\, l(x_{n_k}) \ge l(x_0)$, and from the second:
$\lim l(x_{n_k}) = a$, whence $l(x_0) \le a$. But $a$ is the strict lower bound of
values of $l(x)$, so that $l(x_0) = a$, which is what we set out to prove.


94. Separability. A metric space $X$, containing an infinite set of
elements, is called separable if there exists a denumerable set of elements
of $X$: $x_1, x_2, \ldots$, dense in $X$, i.e. for any $x \in X$ and any
$\varepsilon > 0$, there is an element $x_k$ of the set mentioned such that
$\rho(x, x_k) \le \varepsilon$.

We proved above the separability of $l_p$ and $L_p$ ($p \ge 1$) [59, 60]. In
space $C$, the set of all polynomials with rational coefficients is an
example of such a denumerable set. An example in space $R_n$ is the set of
elements $(a_1, a_2, \ldots, a_n)$, for which all the $a_k$ are rational or (in the
case of a complex space) have the form $a_k = \alpha_k + i\beta_k$, where
$\alpha_k$ and $\beta_k$ are real rational numbers.

In space $s$, the denumerable set is the set of elements of the form
$(a_1, a_2, \ldots, a_n, 0, 0, \ldots)$, where all the $a_k$ are rational numbers.

Let us show that space $m$ is not separable. We take the set $U$ of
different elements $x(a_1, a_2, \ldots)$ of $m$ such that the $a_k$ are zero or
unity. On assuming that $a_k$ is the $k$th digit after the point of a number
written in the numeration system to base 2, we see that the set $U$ is
non-denumerable. On taking into account what was said in [1], it is
easily seen that it has the power of a continuum. We have $\rho(x, y) = 1$
for any two distinct elements $x$ and $y$ of $U$. Let space $m$ be separable,
i.e. there exists a denumerable set $x_k$ ($k = 1, 2, \ldots$) of elements of $m$,
dense in $m$, and let $S_k$ be the sphere with centre $x_k$ and radius $1/3$.
The set of these spheres is denumerable, and at least one of them
contains more than one element of $U$. Let $y$ and $z$ be distinct elements
of $U$, lying in the same sphere. We have: $\rho(y, z) \le 2/3$, which
contradicts $\rho(y, z) = 1$, and proves that $m$ is not separable.
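
The fact that any two distinct sequences of zeros and unities are at distance 1 in $m$ can be checked on truncated sequences. The sketch below assumes sequences cut off after a fixed number of components, with the distance of $m$ taken as the largest absolute difference of components; the names `sup_distance` and `binary_sequences` are hypothetical.

```python
from itertools import product


def sup_distance(x, y):
    # Distance of the space m, restricted to the components that are kept.
    return max(abs(a - b) for a, b in zip(x, y))


def binary_sequences(length):
    # All sequences of zeros and unities of the given (truncated) length.
    return list(product((0, 1), repeat=length))


sequences = binary_sequences(6)
distances = {
    sup_distance(x, y)
    for i, x in enumerate(sequences)
    for y in sequences[i + 1:]
}
```

Already for length 6 there are 64 such elements with mutual distance 1, and the number doubles with each further component, which is the finite shadow of the non-denumerable set $U$ used in the proof.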

THEOREM. Every set U of elements of a separable space X is separable. 

We have to prove the existence of a finite or denumerable set of
elements of $U$, dense in $U$. Since $X$ is separable, there exists a
denumerable set of elements $x_n$ ($n = 1, 2, \ldots$) of $X$, dense in $X$. Let
$S(x, r)$ denote the sphere with centre $x$ and radius $r$. We consider the
spheres $S(x_n, 1/2^k)$ ($n, k = 1, 2, \ldots$), and, if a sphere contains elements
of $U$, we choose one of these elements. We obtain in this way a finite
or denumerable set $u_m$ ($m = 1, 2, \ldots$) of elements of $U$. Let $u$ be any
element of $U$ and $\varepsilon$ a given positive number. We must show that
$\rho(u, u_m) \le \varepsilon$ for at least one of the $u_m$. We can assume
$\varepsilon < 1$, so that there exists a positive integer $l$ such that

$$\frac{1}{2^{l-1}} \le \varepsilon < \frac{1}{2^{l-2}}.$$

Since the set of $x_n$ is dense in $X$, there exists an $n = n_0$ such that
$\rho(u, x_{n_0}) < \varepsilon/4 < 1/2^l$, whence it follows that the sphere
$S(x_{n_0}, 1/2^l)$ contains elements of $U$. Let $u_m$ be the element that we have
chosen from this sphere (it may not in fact coincide with $u$). Since $u$ and
$u_m \in S(x_{n_0}, 1/2^l)$, we have $\rho(u, u_m) \le 1/2^{l-1} \le \varepsilon$,
which is what we wanted to prove.


95. Linear normed spaces. We shall now introduce abstract spaces
which are metric but also have certain other properties. As above, we
shall denote elements of a space by the last letters of the alphabet
$x, y, z, \ldots$, and numbers by the first letters $a, b, c, \ldots$ These numbers
may be regarded as either real or complex. In the first case we have a
real space, and in the second a complex space. We shall in future consider
complex spaces, unless there is some special proviso.

The set $X$ of elements $x, y, z, \ldots$ is called a linear space if its
elements satisfy the following axioms.

Axiom A. Elements of X can be multiplied by a number and added, i.e. 
if x and y are elements of X and a is a number, ax and x + y are also 
definite elements of X. 

The operations mentioned are subject to the following laws:

(1) $x + y = y + x$; (2) $x + (y + z) = (x + y) + z$;
(3) $a(x + y) = ax + ay$; (4) $(a + b)x = ax + bx$;
(5) $a(bx) = (ab)x$; (6) $1x = x$;
(7) if $x + y = x + z$, then $y = z$.

We must introduce the concept of zero element. Let $x$ and $y$ be any
two elements of $X$. We shall prove that $0x = 0y$. Let us write $0x = \theta$
and $0y = \theta_1$. We can write, on using laws 4 and 6:

$$x + \theta = 1x + 0x = (1 + 0)x = 1x = x$$

and similarly, $y + \theta_1 = y$. Further, we have from laws 1 and 2:

$$(x + y) + \theta = (x + \theta) + y = x + y,$$

and similarly: $(x + y) + \theta_1 = x + y$, whence it follows that
$(x + y) + \theta = (x + y) + \theta_1$, and we obtain by (7):
$\theta = \theta_1$. Thus multiplication of any element by the number 0 gives us
the same element, which we call the zero element, and denote by the symbol
$\theta$. The following simple corollaries of the above laws are easily verified.
Given any complex $a$, $a\theta = \theta$. If $ax = \theta$ and $a \ne 0$, then
$x = \theta$. If $ax = bx$ and $x \ne \theta$, then $a = b$. If $ax = ay$ and
$a \ne 0$, then $x = y$. We denote $(-1)x$ by the symbol $(-x)$. The difference
$x - y$ is defined by

$$x - y = x + (-y).$$

It is easily verified that the ordinary rules of algebra also hold for a
difference. We shall in future simply write 0 for the zero element. This
will not cause confusion with the number 0 if proper attention is paid
to our later equations. If one side of an equation is an element of $X$,
and the other side contains 0, this must be regarded as the zero
element of $X$.

DEFINITION. The elements $x_1, x_2, \ldots, x_m$ are said to be linearly
independent if the equation

$$c_1 x_1 + c_2 x_2 + \ldots + c_m x_m = 0$$

can hold when and only when all the numbers $c_k$ ($k = 1, 2, \ldots, m$) are
zero.

For the $n$-dimensional complex space considered in Volume III, the
maximum number of linearly independent elements is equal to $n$. An
axiom is sometimes introduced which excludes the possibility of a
finite-dimensional space.

Axiom B. Given any positive integer n, there exist n linearly inde- 
pendent elements. This will not play an important role below. Let us 
introduce one further axiom. 

Axiom C. Each element $x$ has associated with it a definite real non-negative
number $\|x\|$, the norm of the element, and this norm must
satisfy the following three conditions:

(1) $\|\theta\| = 0$ and $\|x\| > 0$ for $x \ne \theta$,
(2) $\|x + y\| \le \|x\| + \|y\|$,
(3) $\|ax\| = |a| \cdot \|x\|$,

where $a$ is any number and $|a|$ is the modulus of $a$. It follows from the
second and third properties of the norm that $\|-x\| = \|x\|$, and

$$\|x - y\| \ge \|x\| - \|y\|; \qquad \|x - y\| \ge \|y\| - \|x\|,$$

i.e.

$$\|x - y\| \ge \big|\, \|x\| - \|y\| \,\big|. \qquad (41)$$

The distance between two elements is defined by $\rho(x, y) = \|x - y\|$,
and $\rho(x, y)$ is easily seen to satisfy all three conditions of the definition
of metric space, i.e. every linear normed space is at the same time a
metric space, so that everything said about metric spaces holds for
linear normed spaces. The norm can be expressed in terms of distance
by the obvious formula $\|x\| = \rho(x, \theta)$.

If we add the requirement that the space be complete, a linear
normed space is called a type B or B space. Everything below refers
to B spaces.
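
Inequality (41) can be checked numerically for a concrete norm. The sketch below assumes real vectors of finite dimension with the norm $(\sum |\xi_l|^p)^{1/p}$; the random sampling and the names `p_norm`, `reverse_triangle_gap` and `smallest_gap` are hypothetical and serve only as an illustration.

```python
import random


def p_norm(x, p):
    # Assumed norm: (sum |xi|^p)^(1/p) on finite real vectors.
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def reverse_triangle_gap(x, y, p):
    # ||x - y|| - | ||x|| - ||y|| |, which (41) asserts to be non-negative.
    diff = [a - b for a, b in zip(x, y)]
    return p_norm(diff, p) - abs(p_norm(x, p) - p_norm(y, p))


def smallest_gap(trials=200, dim=5, p=3.0, seed=0):
    # Smallest gap found over randomly chosen pairs of vectors.
    rng = random.Random(seed)
    gaps = []
    for _ in range(trials):
        x = [rng.uniform(-1, 1) for _ in range(dim)]
        y = [rng.uniform(-1, 1) for _ in range(dim)]
        gaps.append(reverse_triangle_gap(x, y, p))
    return min(gaps)
```

The gap is zero when one element is a non-negative multiple of the other, for instance for $x = (1, 0)$ and $y = (2, 0)$, so (41) cannot be improved in general.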

An incomplete linear normed space can be made complete by the
completion of [85]. The norm is defined for an added element by
$\|x\| = \rho(x, \theta)$. All the axioms are retained on completion, and
in particular, axiom A. This latter follows from the continuity
of the sum $x + y$ and the product $ax$, which will be discussed below.
We shall encounter later convergent sequences of numbers and
elements. As above, we shall write $a_n \to a_0$ for convergent sequences
of numbers, and $x_n \Rightarrow x_0$ for convergent element sequences.

The convergence $x_n \Rightarrow x_0$ is equivalent to $\|x_0 - x_n\| \to 0$. We
can consider in B infinite series $u_1 + u_2 + u_3 + \ldots$, where $u_k \in B$
($k = 1, 2, \ldots$). We shall write $x_n = u_1 + u_2 + \ldots + u_n$.

If the sequence $x_n$ of elements of B has a limit $x_0$, we say that the
series indicated is convergent and has the sum $x_0$.

Let us show that $x + y$ and $ax$ are continuous, i.e. if $x_n \Rightarrow x_0$,
$y_n \Rightarrow y_0$ and $a_n \to a_0$, then $x_n + y_n \Rightarrow x_0 + y_0$ and
$a_n x_n \Rightarrow a_0 x_0$. We have

$$\|(x_0 + y_0) - (x_n + y_n)\| \le \|x_0 - x_n\| + \|y_0 - y_n\|.$$

The right-hand side tends to zero, so that the left does the same, i.e.
$x_n + y_n \Rightarrow x_0 + y_0$. We write
$a_0 x_0 - a_n x_n = a_0 x_0 - a_n x_0 + a_n x_0 - a_n x_n$, and we have

$$\|a_0 x_0 - a_n x_n\| \le \|a_0 x_0 - a_n x_0\| + \|a_n x_0 - a_n x_n\| = |a_0 - a_n| \cdot \|x_0\| + |a_n| \cdot \|x_0 - x_n\|,$$

the right-hand side of which tends to zero, since $a_n \to a_0$ implies that
$|a_n|$ is bounded. Notice further that, if $x_n \Rightarrow x_0$, then
$\|x_n\| \to \|x_0\|$. This follows from $\|x_n\| = \rho(x_n, \theta)$ and the
continuity of the distance. We define a lineal in B as follows: a set $U$ of
elements of B is called a lineal when the following condition is satisfied: if
$x_k \in U$ ($k = 1, 2, \ldots, m$), any linear combination
$c_1 x_1 + c_2 x_2 + \ldots + c_m x_m \in U$. It is sufficient to verify that, if
$x$ and $y \in U$, then $x + y \in U$ and $ax \in U$ for any choice of the number
$a$. On putting $a = 0$, we see that the zero element belongs to every non-empty
lineal. A closed lineal will be called a subspace. It is easily seen that, if $U$
is a non-closed lineal, the closed set $\bar{U}$ is a subspace, i.e. the closure
of a lineal leads to a subspace. This follows from the fact proved above, that
$x + y$ and $ax$ are continuous. If a set $U$ is not a lineal, on forming all
possible finite linear combinations $c_1 x_1 + c_2 x_2 + \ldots + c_m x_m$ of
elements $x_k \in U$, we get a new set of elements $V$, which will be a lineal. It
is usually termed the linear envelope of $U$. This is the least lineal containing
$U$.

If $x_1, x_2, \ldots, x_k$ are linearly independent elements, the set $U$ of
elements of B, expressible by $x = c_1 x_1 + c_2 x_2 + \ldots + c_k x_k$ with
every possible choice of numbers $c_s$, is obviously a lineal. It is easily
shown that this lineal is a closed set (subspace). In view of the linear
independence of the $x_s$, the representation of $x$ by the above formula
is unique. Such a lineal is usually described as finite-dimensional. All
the elements of $U$ can be expressed by:
$x = c_1 y_1 + c_2 y_2 + \ldots + c_k y_k$, where $y_s$ ($s = 1, 2, \ldots, k$) are
any linearly independent elements of $U$ and $c_s$ are arbitrary numbers, and the
number of terms is equal to $k$ in any formula expressing an element of $U$ in this
form. This number is called the dimensionality of $U$.

Notice that every B subspace is also a B space. 

We have already defined isometry for metric spaces. Let us define
isometry for B spaces. Two such spaces $X$ and $X'$ are said to be isometric
if a one-to-one correspondence can be established between their
elements such that: (1) if $x$ and $x'$, $y$ and $y'$ are any two pairs of
corresponding elements of $X$ and $X'$, then $ax + by$ and $ax' + by'$ are
also corresponding elements, for any choice of the numbers $a$ and $b$;
(2) the norms of corresponding elements are equal.

It follows from the above that the zero elements of X and X’ must 
be corresponding elements and that the distance between correspond- 
ing elements is the same in X and X’. 

There is no sense in distinguishing between isometric spaces from 
the point of view of abstract theory, and we shall write X = X’. 




96. Examples of normed spaces. 1. All the spaces mentioned in [87] except
for $s$ and $S$ are B spaces, if we put $\|x\| = \rho(0, x)$ for them. In the case
of sequence spaces, multiplication of an element by a number $a$ amounts, by
definition, to multiplication of each number of the sequence by $a$, whilst
addition amounts to addition of the numbers having the same index in the
sequences:

$$a(\xi_1, \xi_2, \ldots) = (a\xi_1, a\xi_2, \ldots); \qquad (\xi_1, \xi_2, \ldots) + (\eta_1, \eta_2, \ldots) = (\xi_1 + \eta_1, \xi_2 + \eta_2, \ldots).$$

In the case of function spaces, multiplication of an element by a number
$a$ amounts, by definition, to multiplication of the function by $a$, and addition
of elements to addition of the corresponding functions. The zero element in a
sequence space is the sequence consisting of zeros, and in a function space
it is the function that vanishes identically (in $C$ and $V$) or is equivalent to
zero (in $M$, $S$ and $L_p$).

2. Let us take a bounded domain $D$ in $R_n$ and the set $C^{(l)}$ of functions
$\varphi(x)$ having continuous partial derivatives up to order $l$ inside $D$, where
the derivatives have limiting values on the boundary of $D$ and are functions
continuous in the closed domain $\bar{D}$. For brevity, we shall say in this case
that the function has derivatives continuous in $\bar{D}$. The set of functions in
question is a linear space. We introduce the norm into it:

$$\|\varphi\| = \max_{x \in \bar{D},\ 0 \le k \le l} |D^k \varphi(x)|, \qquad (42)$$

where $D^k \varphi$ denotes any $k$th order derivative. The maximum is taken over
all $x$ belonging to $\bar{D}$, for $\varphi(x)$ and its derivatives up to order
$l$. It is easily seen that the norm satisfies the three basic conditions [95].

Convergence in $C^{(l)}$ is uniform convergence in $\bar{D}$ of a function and all
its derivatives up to order $l$. By Cauchy's convergence test and the familiar
theorem on term by term differentiation of a function sequence, we can say that,
if the sequence of elements $\varphi_n(x) \in C^{(l)}$ is mutually convergent, it
is convergent to some element $\varphi(x) \in C^{(l)}$, i.e. $C^{(l)}$ is a B space.


97. Operators in normed spaces. We have already defined operators
in metric spaces $X$. New points arise in linear normed spaces. We shall
assume that an operator $A$ is defined on some lineal $D(A)$ of the B
space $X$, whilst the set of its values $R(A)$ belongs to another B space
$X'$. An operator is said to be distributive if, for $x_k \in D(A)$ and any
numbers $c_k$, we have

$$A(c_1 x_1 + c_2 x_2 + \ldots + c_m x_m) = c_1 A x_1 + c_2 A x_2 + \ldots + c_m A x_m. \qquad (43)$$

It is sufficient to show that $A(cx) = cAx$ and $A(x + y) = Ax + Ay$. It follows
at once from (43) that $R(A)$ is a lineal in $X'$ and that, if $\theta$ is the
zero element in $X$ and $\theta'$ that in $X'$, then $A\theta = \theta'$. For,
$A(\theta) = A(0x)$, where $x \in D(A)$, but $A(0x) = 0Ax = \theta'$.




We shall only discuss distributive operators below, defined on lineals. 
Let us recall the definition of continuity. An operator A is said to be 
continuous on an element $x_0$ when the following condition is satisfied: if $x_n$ 
($n = 1, 2, \ldots$) and $x_0 \in D(A)$ and $x_n \Rightarrow x_0$ in X, then $Ax_n \Rightarrow Ax_0$ 
in X’. It is easily shown that, if a distributive operator A is continuous 
on an element $y_0 \in D(A)$, it is continuous on any element $z_0 \in D(A)$. 
Let $z_n$ and $z_0 \in D(A)$ and $z_n \Rightarrow z_0$; we have to show that $Az_n \Rightarrow Az_0$. 
We form the elements $y_n = (z_n - z_0) + y_0$ of D(A), where $y_n \Rightarrow y_0$. 
We have $Az_n = Az_0 + Ay_n - Ay_0$, and, since $Ay_n \Rightarrow Ay_0$, we have 
$Az_n \Rightarrow Az_0$. 

Thus there is no point in talking of continuity on an element of D(A); 
we must talk of continuity on the whole of D(A). A distributive operator 
A is said to be bounded if there exists a positive number C such 
that, for any $x \in D(A)$: 

$$\|Ax\| \le C\,\|x\|. \tag{44}$$


Notice that the norm on the right is taken in X, and in X’ on the 
left. We shall prove that, for a distributive operator, boundedness 
and continuity in D(A) are equivalent. 

By what has been said, it is sufficient to consider continuity on the 
zero element $\theta$. Let (44) hold. We shall show that, if $x_n \in D(A)$ and 
$x_n \Rightarrow \theta$, then $Ax_n \Rightarrow \theta'$. It follows from $x_n \Rightarrow \theta$ that, given any 
$\varepsilon > 0$, there exists an N such that $\|x_n\| \le \varepsilon$ for $n > N$ and, by (44), 
$\|Ax_n\| \le C\varepsilon$ for $n > N$, whence, since $\varepsilon$ is arbitrary, we have 
$Ax_n \Rightarrow \theta'$. Now let $Ax_n \Rightarrow \theta'$ if $x_n \Rightarrow \theta$, and let us prove (44). 
If $x = \theta$, (44) reduces to $\|\theta'\| \le C\,\|\theta\|$, i.e. $0 \le C \cdot 0$, which is 
fulfilled (with the = sign) for any choice of C; thus it is sufficient to prove 
(44) for $x \ne \theta$. We use reductio ad absurdum. If (44) is not valid, there 
exists a sequence $x_n \in D(A)$ ($\|x_n\| > 0$) such that $\|Ax_n\| = C_n \|x_n\|$, 
where $C_n \to +\infty$. On introducing the elements 
$z_n = (1/(C_n \|x_n\|))\,x_n \in D(A)$, for which $\|z_n\| = 1/C_n \to 0$, we get $\|Az_n\| = 1$, 
which contradicts the continuity of the operator A on the zero 
element. 

If A is the annihilation operator, i.e. $Ax = \theta'$ for any $x \in D(A)$, we 
can put C = 0 in (44). We must have C > 0 for any other operator, 
and there exists a least positive C for which (44) holds. It is called 
the norm of operator A and is obtained from the formula: 

$$n_A = \sup_{\|x\| = 1,\ x \in D(A)} \|Ax\|. \tag{45}$$

The norm $n_A$ is also written as $\|A\|$, so that we can write 

$$\|Ax\| \le n_A \|x\| \quad \text{or} \quad \|Ax\| \le \|A\|\,\|x\|. \tag{46}$$

The above remarks are exactly analogous to those made, say, in 
[IV; 36] for a particular case. 
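
Formula (45) can be tried out in the simplest finite-dimensional case. The sketch below is an illustration under our own assumptions: X = X’ is the plane with the norm $\|x\| = \max(|x_1|, |x_2|)$, the operator is given by a hypothetical 2 × 2 matrix, and the supremum is approximated by sampling the unit sphere (the boundary of a square). For this norm the result should agree with the largest row sum of absolute values of the matrix.

```python
def max_norm(v):
    """Norm max |v_i| of a vector of the plane."""
    return max(abs(c) for c in v)

def apply(A, x):
    """Matrix-vector product, i.e. the value Ax of the operator."""
    return [sum(a * c for a, c in zip(row, x)) for row in A]

def sampled_operator_norm(A, steps=200):
    """Approximate sup ||Ax|| over ||x|| = 1 by sampling the sides of the unit square."""
    best = 0.0
    for i in range(steps + 1):
        s = -1.0 + 2.0 * i / steps
        for x in ([1.0, s], [-1.0, s], [s, 1.0], [s, -1.0]):
            best = max(best, max_norm(apply(A, x)))
    return best

A = [[1.0, -2.0], [3.0, 0.5]]          # hypothetical example operator
n_A = sampled_operator_norm(A)
row_sum = max(sum(abs(a) for a in row) for row in A)
print(n_A, row_sum)

# Inequality (46) on a few sample elements
samples = [[0.3, -1.7], [2.0, 2.0], [-5.0, 0.25]]
print(all(max_norm(apply(A, x)) <= n_A * max_norm(x) + 1e-12 for x in samples))
```

The supremum is attained here at a corner of the square, which belongs to the sampled points, so the sampled value and the row-sum value coincide.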

THEOREM 1. If the lineal D(A) for a distributive bounded operator A 
is dense in X, A can be extended to the whole of X whilst preserving its 
norm and distributive property. 

Since $\overline{D(A)} = X$, we can write any element $x_0 \in X$ as a limit 
$x_n \Rightarrow x_0$, where $x_n \in D(A)$. We shall show that a limit of $Ax_n$ exists 
in X’, which is independent of the sequence $x_n$ taken. In fact, 

$$\|Ax_n - Ax_m\| = \|A(x_n - x_m)\| \le \|A\|\,\|x_n - x_m\|$$

and, since $x_n \Rightarrow x_0$, the right-hand side $\to 0$ as n and $m \to +\infty$; but 
now $\|Ax_n - Ax_m\| \to 0$, and a limit of $Ax_n$ exists, since X’ is complete. 
It remains to show that the limit is independent of the choice of 
sequence. Let $x_n$ and $x_n' \in D(A)$, where $x_n \Rightarrow x_0$ and $x_n' \Rightarrow x_0$. 
We have to show that the limits of $Ax_n$ and $Ax_n'$ are the same. The 
actual existence of the limits follows from the above. 

It can easily be seen that the sequence $x_1, x_1', x_2, x_2', x_3, x_3', \ldots$ also 
has the limit $x_0$. Thus the sequence $Ax_1, Ax_1', Ax_2, Ax_2', Ax_3, Ax_3', \ldots$ 
has some limit $y \in X'$. But now the subsequences $Ax_n$ and $Ax_n'$ have 
the limit y, i.e. the same limit. 

If $x_0 \in X$, but $x_0 \notin D(A)$, we take a sequence $x_n \in D(A)$ with 
$x_n \Rightarrow x_0$ and put $Ax_0 = \lim_{n \to \infty} Ax_n$. We show that the operator thus 
defined in X is distributive and that its norm is not increased on 
passing from D(A) to X. Let $x_0$ and $x_0' \in X$, and let $x_n$ and $x_n'$ be two sequences 
of D(A) having limits $x_0$ and $x_0'$ (if e.g. $x_0 \in D(A)$, we can put all the 
$x_n = x_0$). On recalling that operator A is distributive in D(A), and 
that addition and multiplication by a number are continuous, we get 

$$A(c_1 x_0 + c_2 x_0') = \lim_{n \to \infty} A(c_1 x_n + c_2 x_n') = c_1 \lim_{n \to \infty} Ax_n + c_2 \lim_{n \to \infty} Ax_n' = c_1 Ax_0 + c_2 Ax_0'.$$

The preservation of the norm follows from $\|Ax_n\| \le \|A\|\,\|x_n\|$, 
in which the $\|A\|$ on the right is the norm of A in D(A), after a 
passage to the limit. The norm can obviously not decrease. The theorem 
is proved. 

The above method of extending A is usually called an extension in 
continuity. 




Let us show further that the extension of A from D(A) onto X is 
unique in the ordinary sense: if B is a distributive operator bounded 
in X, coinciding with A on D(A), B must coincide everywhere with 
the extension of A in continuity. Let $x_0 \notin D(A)$, $x_n \in D(A)$ and 
$x_n \Rightarrow x_0$. Since B is continuous and coincides with A on D(A), 
we have 

$$Bx_0 = \lim_{n \to \infty} Bx_n = \lim_{n \to \infty} Ax_n = Ax_0,$$

which is what we had to prove. If the operator A is defined throughout 
space X, is distributive and bounded, we shall call it a linear operator 
(it is occasionally called a bounded linear operator in this case). If 
different Ax of R(A) correspond to different x of the lineal D(A), an 
inverse operator $A^{-1}$ exists, defined on R(A) and associating each x’ of 
R(A) with a unique element x of D(A), connected with x’ by the 
relationship x’ = Ax. Since A is distributive, it follows at once that 
$A^{-1}$ is distributive. But the fact that A is bounded does not imply 
that $A^{-1}$ is bounded. 

Let us take as an example the operator $\varphi = Af$ on the segment [0, 1] 
in space C: 

$$\varphi(x) = \int_0^x f(t)\,dt, \tag{47}$$

where we assume that X’ is space C itself. This is possible, since 
$\varphi(x) \in C$ on [0, 1]. Operator (47) transforms the whole of C into a 
lineal, consisting of the functions $\varphi(x)$ having a continuous derivative 
and vanishing at x = 0. There exists on this lineal of functions $\varphi(x)$ 
an inverse distributive operator $f(x) = \varphi'(x)$, but it is unbounded. 
For, the functions $\varphi_n(x) = \sin n\pi x$ belonging to this lineal have norm 1 
for any choice of the number n, whilst $\varphi_n'(x) = n\pi \cos n\pi x$ has norm 
$n\pi$, which increases indefinitely as $n \to \infty$. 
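
The growth of the norms in this example is easily observed numerically. The following sketch is only an illustration; the grid size and the helper names are our own assumptions, and the norm of C on [0, 1] is approximated by a maximum over grid points.

```python
import math

def sup_norm(g, samples=20001):
    """Grid approximation of max |g(x)| over [0, 1] (the norm of C)."""
    return max(abs(g(i / (samples - 1))) for i in range(samples))

def phi_n(n):
    # phi_n(x) = sin(n pi x), an element of the range of operator (47)
    return lambda x: math.sin(n * math.pi * x)

def dphi_n(n):
    # Its image under the inverse operator: phi_n'(x) = n pi cos(n pi x)
    return lambda x: n * math.pi * math.cos(n * math.pi * x)

for n in (1, 5, 25):
    print(n, sup_norm(phi_n(n)), sup_norm(dphi_n(n)), n * math.pi)
```

The second column stays at 1 while the third grows like $n\pi$, so no constant C can serve in (44) for the inverse operator.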

THEOREM 2. If B is a linear operator in X, $R(B) \subset X$ and 
$\|B\| = a < 1$, the operator (E − B), where E is the operator of the identity 
transformation (i.e. Ex = x), has an inverse $(E - B)^{-1}$, defined on the 
whole of X, which is distributive and bounded. 

We consider the equation 

$$y = x - Bx \tag{48}$$

or 

$$x = y + Bx, \tag{49}$$

where y is given and x is the required element of X. It is easily shown 
that the operator $Ax = y + Bx$ (A is the notation of [86]) satisfies 
the condition for the principle of compressed mappings to be applicable. 
For, 

$$\|Ax_2 - Ax_1\| = \|B(x_2 - x_1)\| \le a\,\|x_2 - x_1\| \quad (0 \le a < 1).$$

Thus, equation (48) or (49) has a unique solution for any $y \in X$, i.e. 
there exists an inverse operator $x = (E - B)^{-1} y$, defined on the whole 
of X. It is obviously distributive. Let us show that it is bounded. 
We have 

$$\|x - y\| = \|Bx\| \le a\,\|x\|$$

and all the more 

$$\|x\| - \|y\| \le a\,\|x\|, \quad \text{i.e.} \quad \|x\| \le \frac{1}{1 - a}\,\|y\|,$$

whence it follows that the norm of $(E - B)^{-1}$ is not greater than 
$1/(1 - a)$. 

THEOREM 3. A linear operator maps a compact set into a compact 
set. 

Let U be a compact set of elements of X, $x_n$ ($n = 1, 2, \ldots$) be a 
sequence of elements of U and A a linear operator. We have to show 
that we can extract from the sequence $Ax_n$ ($n = 1, 2, \ldots$) a 
subsequence convergent in X’. The compactness of U implies the existence 
of a subsequence $x_{n_k}$ having a limit: $x_{n_k} \Rightarrow x_0$ in X. Now, since A 
is continuous, we have $Ax_{n_k} \Rightarrow Ax_0$ in X’, which is what we set out 
to prove. 

We shall state the following theorem without proof. 

THEOREM 4. If a linear operator A is defined in a (B) space X, 
mapping X one-to-one onto the whole of space X’ (type B), the inverse 
operator $A^{-1}$ (defined in the whole of X’) is also linear. 
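
Returning to Theorem 2, the solution of equation (49) and the bound $1/(1 - a)$ can be illustrated in the plane. The sketch below rests on our own assumptions: X is the plane with the norm $\max(|x_1|, |x_2|)$, B is a hypothetical matrix whose norm (the largest row sum of absolute values) equals 0.5, and the solution is found by the successive approximations $x_{k+1} = y + Bx_k$, which we take to be the procedure behind the principle of compressed mappings.

```python
def max_norm(v):
    return max(abs(c) for c in v)

def apply(B, x):
    return [sum(b * c for b, c in zip(row, x)) for row in B]

def solve_by_iteration(B, y, tol=1e-14, max_steps=1000):
    """Successive approximations x_{k+1} = y + B x_k, starting from x_0 = y."""
    x = list(y)
    for _ in range(max_steps):
        x_next = [yi + bi for yi, bi in zip(y, apply(B, x))]
        if max_norm([p - q for p, q in zip(x_next, x)]) < tol:
            return x_next
        x = x_next
    return x

B = [[0.2, 0.3], [0.1, -0.4]]                      # hypothetical operator, ||B|| = 0.5
a = max(sum(abs(b) for b in row) for row in B)
y = [1.0, 2.0]
x = solve_by_iteration(B, y)

# Residual of equation (48) and the bound ||x|| <= ||y|| / (1 - a)
residual = max_norm([xi - bi - yi for xi, bi, yi in zip(x, apply(B, x), y)])
print(x, residual, max_norm(x), max_norm(y) / (1.0 - a))
```

The iteration converges because each step reduces the distance between successive approximations by at least the factor a.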





98. Linear functionals. Let us consider a real B space X (the 
elements of X are only multiplied by real numbers). An operator, the 
domain of values of which lies in real number space, is called a func- 
tional in X. The real number space is the real B space with the usual 
definition of addition of real numbers and of multiplication of them 
by a real number. The norm is the absolute value of the real number 
[87]. 

Everything said in [97] about operators also holds for functionals. 
A functional is bounded if 

$$|l(x)| \le \|l\|\,\|x\|, \tag{50}$$

where $|l(x)|$ is the absolute value of the real number $l(x)$ and $\|l\|$ is 
the norm of $l(x)$. A linear functional is a particular case of a linear 
operator. Here, D(l) coincides with the whole of X. 




THEOREM 1. If a distributive bounded functional $l(x)$ is given on a 
lineal U, it can be extended to the whole of X in such a way that $l(x)$ 
becomes a linear functional in X with the same norm as in U. 

By hypothesis, in addition to the fact that $l(x)$ is distributive, we 
have 

$$|l(x)| \le \|l\|_U\,\|x\| \quad (x \in U), \tag{51}$$

where $\|l\|_U$ is the norm of $l(x)$ in U. We shall assume in the proof 
that space X is separable, which simplifies the arguments. The theorem 
still holds, however, for non-separable spaces. 

Since X is separable, there exists a denumerable set of elements, 
dense in X. We leave in this denumerable set only the elements that 
do not belong to U. If there are no such elements, U is everywhere dense 
in X, and we can extend $l(x)$ in continuity on to the whole of X [97]. 
Otherwise, the remaining elements of the denumerable set can also 
be numbered: $x_1, x_2, x_3, \ldots$ 

We take the set $U_1$ of elements z of the form $z = y + tx_1$, where y 
is any element of U and t is any real number. It is easily seen that $U_1$, 
like U, is a lineal. Let us show that the above expression of z is unique. 
If z has two different forms: 

$$z = y + tx_1 = y' + t'x_1, \tag{52}$$

then $t \ne t'$ in these forms, since if $t = t'$, then $y = y'$. We show that $t \ne t'$ 
leads to an absurdity. We have from (52): $x_1 = (y' - y)/(t - t')$, 
whence it follows that $x_1 \in U$, and this contradicts what has been 
said. We now take any two elements $x'$ and $x''$ of U and establish an 
inequality. We have 

$$l(x') - l(x'') = l(x' - x'') \le \|l\|_U\,\|x' - x''\|. \tag{53}$$

Notice that, if the left-hand side is a negative number, the 
inequality is trivial. On observing that 
$\|x' - x''\| = \|(x' + x_1) - (x'' + x_1)\| \le \|x' + x_1\| + \|x'' + x_1\|$, we get by (53): 

$$l(x') - \|l\|_U\,\|x' + x_1\| \le l(x'') + \|l\|_U\,\|x'' + x_1\|.$$

On taking the strict lower bound of the set of numbers on the right, 
and the strict upper bound for the left-hand side, when $x'$ and $x''$ run 
independently over the whole of U, we get 

$$\sup_{x \in U} \left[ l(x) - \|l\|_U\,\|x + x_1\| \right] \le \inf_{x \in U} \left[ l(x) + \|l\|_U\,\|x + x_1\| \right],$$

where the right-hand side is obviously finite, i.e. so is the left. Thus a 
real number a exists, satisfying 

$$\sup_{x \in U} \left[ l(x) - \|l\|_U\,\|x + x_1\| \right] \le a \le \inf_{x \in U} \left[ l(x) + \|l\|_U\,\|x + x_1\| \right]. \tag{54}$$


We now extend $l(x)$ from U onto $U_1$. Let $z = y + tx_1$ be any element 
of $U_1$. We put 

$$l(z) = l(y) - ta, \tag{55}$$

where a is a fixed real number satisfying (54). If $z \in U$, then t = 0, 
and $l(z)$ coincides with $l(y)$, i.e. (55) defines a functional on $U_1$ 
coinciding with the previous functional on U. We have therefore retained 
the previous notation for the extended functional. The fact that $l(z)$ 
is distributive follows at once from (55), the distributiveness of $l(y)$ 
on U and the formulae 

$$z = y + tx_1; \quad cz = cy + ctx_1;$$
$$z' = y' + t'x_1; \quad z + z' = (y + y') + (t + t')x_1.$$
We finally show that the norm of $l(z)$ in $U_1$ is not greater than $\|l\|_U$ 
(it cannot be lowered). We shall assume t > 0. On observing that 
$l(y) = t\,l(y/t)$, where $y/t \in U$, we get 

$$l(z) = t \left[ l\!\left( \frac{1}{t} y \right) - a \right]. \tag{56}$$

But it follows from (54) that 

$$a \ge l\!\left( \frac{1}{t} y \right) - \|l\|_U \left\| \frac{1}{t} y + x_1 \right\|$$

and on replacing a in (56) by the smaller number on the right-hand 
side of this inequality, we get (t > 0): 

$$l(z) \le t\,\|l\|_U \left\| \frac{1}{t} y + x_1 \right\| = \|l\|_U\,\|y + tx_1\| = \|l\|_U\,\|z\|.$$

We turn to the case t < 0. It follows from (54) that 

$$a \le l\!\left( \frac{1}{t} y \right) + \|l\|_U \left\| \frac{1}{t} y + x_1 \right\|$$

and, on replacing a in the difference $l(y/t) - a$ of (56) by a greater 
number, we get ($|t| = -t$): 

$$l\!\left( \frac{1}{t} y \right) - a \ge -\|l\|_U \left\| \frac{1}{t} y + x_1 \right\| = -\frac{1}{|t|}\,\|l\|_U\,\|y + tx_1\| = \frac{1}{t}\,\|l\|_U\,\|z\|.$$

On multiplying both sides of this inequality by the negative t and 
taking (56) into account, we get $l(z) \le \|l\|_U\,\|z\|$. If t = 0, then 
$z \in U$, and the inequality is obvious. Thus we have $l(z) \le \|l\|_U\,\|z\|$ 
for any $z \in U_1$. On replacing z in this inequality by (−z) and noting 
that $l(-z) = -l(z)$ and $\|-z\| = \|z\|$, we get $-l(z) \le \|l\|_U\,\|z\|$. 
These two inequalities finally lead us to 

$$|l(z)| \le \|l\|_U\,\|z\|, \tag{57}$$

from which it follows that the norm remains unchanged with our 
extension of $l(x)$ from U to $U_1$. 
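
The role of the number a in (54) and (55) may be seen in a two-dimensional example. The following sketch is an illustration under our own assumptions: X is the plane with the norm $\|(s, t)\| = |s| + |t|$, U is the axis of elements (s, 0) with $l(s, 0) = s$ (so that $\|l\|_U = 1$), and $x_1 = (0, 1)$. The bounds in (54) are approximated over a finite grid of values of s, and the extension (55) is then tested for several values of a.

```python
def norm1(v):
    """Norm |s| + |t| of an element (s, t) of the plane."""
    return abs(v[0]) + abs(v[1])

def l_on_U(s):
    """The given functional on U = {(s, 0)}; its norm on U is 1."""
    return s

grid = [k / 10.0 for k in range(-500, 501)]
x1 = (0.0, 1.0)

# Left-hand and right-hand bounds of (54), with ||l||_U = 1
lower = max(l_on_U(s) - norm1((s + x1[0], x1[1])) for s in grid)
upper = min(l_on_U(s) + norm1((s + x1[0], x1[1])) for s in grid)
print(lower, upper)

def extended(a):
    """Extension (55): l(y + t x1) = l(y) - t a for z = (s, t)."""
    return lambda z: l_on_U(z[0]) - z[1] * a

def norm_bound_holds(a):
    """Check |l(z)| <= ||z|| (inequality (57)) on a coarse grid of the plane."""
    f = extended(a)
    pts = grid[::25]
    return all(abs(f((s, t))) <= norm1((s, t)) + 1e-12 for s in pts for t in pts)

for a in (-1.0, 0.0, 0.5, 1.0, 1.5):
    print(a, norm_bound_holds(a))
```

In this example the admissible values of a fill the whole segment from −1 to 1, so the extension is not unique, and a value outside the segment (such as 1.5) increases the norm.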

We now turn to a further extension. If an element $x_2$ of the 
above-mentioned sequence belongs to $U_1$, we throw it away. If not, we 
extend $l(x)$ as above from the lineal $U_1$ to the lineal $U_2$, consisting of 
elements $z = y + tx_2$, where y is any element of $U_1$ and t is any real 
number. We proceed in this way, whilst retaining the previous 
notation $x_1, x_2, \ldots$ for elements that have not been thrown away in our 
construction (there may be a finite number of them). We thus 
continue $l(x)$ on to the lineal V of elements having the form 

$$y + c_1 x_1 + c_2 x_2 + \ldots + c_n x_n,$$

where y is any element of U, n is any positive integer (not greater 
than the number of elements $x_k$, if this number is finite), and $c_k$ are 
any real numbers. This lineal V is dense in X, and the functional $l(x)$ 
is distributive and bounded on it, with the norm $\|l\|_U$. It now 
remains to extend $l(x)$ on to the whole of X in continuity. The 
theorem is proved. 

A proof of Theorem 1 for the case of a complex B space may be 
found, say, in G. A. Sukhomlinov’s article (Matem. sb., 3, 1938), and in 
the book Leçons d’Analyse Fonctionnelle by F. Riesz and B. Sz.-Nagy 
(Budapest, 1953; translation published by Ungar, N. Y., 1955). 

The theorem cannot be extended to operators. 

THEOREM 2. If $x_0$ is any fixed element of X, different from zero, a 
linear functional $l(x)$ exists with unit norm ($\|l\| = 1$) such that 
$l(x_0) = \|x_0\|$. 

We take the lineal U of elements of the form $x = tx_0$, where t is any 
real number, and we define a distributive functional $l(x)$ on U by the 
formula $l(tx_0) = t\,\|x_0\|$. When t = 1, we have $l(x_0) = \|x_0\|$ and 
$\|l\|_U = 1$. By Theorem 1, we can extend $l(x)$ on to the whole of X whilst 
preserving the norm, and the theorem is proved. One consequence 
is that functionals with positive norms exist in every B space. 

Notice that the theorem also holds when $x_0 = \theta$ is the zero element. 
We only need to take any element $x_1$, different from $\theta$, and form in 
accordance with the theorem the linear functional $l(x)$ such that 
$\|l\| = 1$ and $l(x_1) = \|x_1\|$. For this $l(x_0) = l(\theta) = 0 = \|x_0\|$. 
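
In a space where the functional of Theorem 2 can be written down explicitly, its two properties are easily checked. The sketch below is an illustration under our own assumptions: X is the plane with the Euclidean norm, and the functional is taken as $l(x) = (x, x_0)/\|x_0\|$, a formula that defines it on the whole space at once, so that no appeal to Theorem 1 is needed in this case. The helper names and the sample element $x_0 = (3, 4)$ are hypothetical.

```python
import math

def euclid(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(c * c for c in v))

def norming_functional(x0):
    """Return l with l(x) = (x, x0) / ||x0||, assuming x0 is not the zero vector."""
    n0 = euclid(x0)
    return lambda x: sum(a * b for a, b in zip(x, x0)) / n0

x0 = [3.0, 4.0]
l = norming_functional(x0)

# Sample the unit circle to estimate sup |l(x)| over ||x|| = 1
unit = [[math.cos(2 * math.pi * k / 360), math.sin(2 * math.pi * k / 360)]
        for k in range(360)]
largest = max(abs(l(u)) for u in unit)
print(l(x0), euclid(x0), largest)
```

The value $l(x_0)$ coincides with $\|x_0\|$, and the sampled values of $|l(x)|$ on the unit circle do not exceed 1 and come close to it.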




99. Conjugate spaces. We consider the space X*, the elements of 
which are all possible linear functionals in the B space X. Functionals 
in X, i.e. real numbers, corresponding to an element x, are denoted by 
$l(x)$, $m(x)$, $n(x)$, ... Regarded as elements of X*, we denote them by 
the single letter: l, m, n, ... X* is a linear space. Addition and 
multiplication by a number are introduced for functionals in the following 
natural way: 

$$(l + m)(x) = l(x) + m(x); \quad (al)(x) = a\,l(x),$$

all the properties of axiom A being fulfilled. The zero element of X* 
is the annihilation functional, i.e. such that $l(x) = 0$ for any $x \in X$. 
The norm of an element l of X* is taken equal to the norm of the 
corresponding functional. This norm is $\ge 0$, where the = sign only holds 
for the annihilation functional. The two other properties of axiom 
C hold. 

The second follows from $|l_1(x) + l_2(x)| \le |l_1(x)| + |l_2(x)| \le 
\|l_1\|\,\|x\| + \|l_2\|\,\|x\| = (\|l_1\| + \|l_2\|)\,\|x\|$, whilst the third 
is obvious. Let us show that X* is a complete space. Suppose we have 
a mutually convergent sequence of elements of X*: 

$$\|l_n - l_m\| \to 0 \quad \text{as } n \text{ and } m \to +\infty. \tag{58}$$


We have to show that there exists an element $l \in X^*$ such that 
$\|l - l_n\| \to 0$ as $n \to \infty$. We write $l_n - l_m = l_{nm} \in X^*$. We have 
$l_n = l_m + l_{nm}$ and $\|l_n\| \le \|l_m\| + \|l_{nm}\|$. By (58), there exists an 
N such that $\|l_{nm}\| \le 1$ for n and m > N. Having fixed $m = m_0 > N$, 
we get $\|l_n\| \le \|l_{m_0}\| + 1$ for n > N, whilst there is a greatest 
among the finite number of non-negative numbers $\|l_n\|$ ($n = 1, 2, \ldots, N$). 
Hence it follows from (58) that there exists a C > 0 such 
that $\|l_n\| \le C$ for all subscripts n. We have further: 
$|l_n(x) - l_m(x)| \le \|l_n - l_m\|\,\|x\|$, and it follows from (58) that 
$|l_n(x) - l_m(x)| \to 0$ as n and $m \to \infty$ with any choice of $x \in X$. By Cauchy’s 
test (for numbers), the number sequence $l_n(x)$ has a limit. This limit, 
which we write as $l(x)$ ($l_n(x) \to l(x)$), is some functional defined in the 
whole of X. Let us show that it is a linear functional. The equations 
$l_n(x + y) = l_n(x) + l_n(y)$ and $l_n(ax) = a\,l_n(x)$ show that it is distributive, 
and $\|l_n\| \le C$, i.e. $|l_n(x)| \le C\,\|x\|$ (by means of a passage to 
the limit as $n \to \infty$), that it is bounded. Thus $l(x)$ is a linear functional 
in X, i.e. $l \in X^*$. It remains to show that $\|l - l_n\| \to 0$. Given any 
$\varepsilon > 0$, by (58), there exists an N such that $|l_n(x) - l_m(x)| \le \varepsilon\,\|x\|$ 
for n and m > N. On passing to the limit as $m \to \infty$, we get 
$|l_n(x) - l(x)| \le \varepsilon\,\|x\|$ for n > N, i.e. $\|l - l_n\| \le \varepsilon$ for n > N, which 
gives $\|l - l_n\| \to 0$. This shows that X* is complete. Thus X* is a B 
space; it is called the conjugate to X. 

We introduce the second conjugate space X** = (X*)*, the elements 
of which are all possible linear functionals in X*. Space X** is obtained 
from X* in the same way as X* from X, and X** is a B space. 

If we fix $x \in X$, for any element $l \in X^*$ there will be a corresponding 
definite real number $l(x)$, i.e. $l(x)$ is a functional in X* for x fixed and 
l varying in X*. Let us denote it by the symbol $L_x(l)$. It follows from 
$(l_1 + l_2)(x) = l_1(x) + l_2(x)$ and $(al)(x) = a\,l(x)$ that 
$L_x(l_1 + l_2) = L_x(l_1) + L_x(l_2)$ and $L_x(al) = aL_x(l)$, i.e. $L_x(l)$ is a distributive 
functional in X*. 

Since $L_x(l) = l(x)$, it follows from 

$$|l(x)| \le \|l\|\,\|x\| \tag{59}$$

that 

$$|L_x(l)| \le \|x\|\,\|l\|, \tag{60}$$

where $\|x\|$ is the norm of x in X and $\|l\|$ is the norm of l in X*. It 
follows from (60) that the norm of the functional $L_x(l)$ in X* is not 
greater than $\|x\|$, i.e. $L_x(l)$ is bounded in X*. Thus $L_x(l)$ is a linear functional 
in X*. If $x = \theta$ is the zero element in X, then $L_\theta(l) = l(\theta) = 0$ for any 
$l \in X^*$, i.e. $L_\theta(l)$ is the zero element in X**. It follows from (60) that 
the norm of $L_x(l)$ is not greater than $\|x\|$. But by Theorem 2 of [98], 
given any $x \ne \theta$, there exists a functional $l(x)$ such that $l(x) = \|x\|$ 
and $\|l\| = 1$. For such an l, both sides of (60) are equal to $\|x\|$ 
and the = sign holds, whence it follows that the norm of $L_x(l)$ is equal 
to $\|x\|$. Further, it follows from $l(x_1 + x_2) = l(x_1) + l(x_2)$ and 
$l(cx_1) = c\,l(x_1)$ that $L_{x_1 + x_2}(l) = L_{x_1}(l) + L_{x_2}(l)$, $L_{cx_1}(l) = cL_{x_1}(l)$ 
and in general $L_{c_1 x_1 + c_2 x_2}(l) = c_1 L_{x_1}(l) + c_2 L_{x_2}(l)$. In particular, 
$L_{x_1 - x_2}(l) = L_{x_1}(l) - L_{x_2}(l)$, whence it follows, in view of what has been 
said about the norm, that $\|L_{x_1} - L_{x_2}\| = \|x_1 - x_2\|$, which shows 
that distinct elements $L_x$ of space X** correspond to distinct x. 

The following important proposition is a consequence of the above. 

THEOREM. We can associate with every element $x \in X$ an element 
$L_x \in X^{**}$. In this correspondence, distinct x correspond with distinct $L_x$, 
addition and multiplication by a number in X correspond to the same 
operations for the corresponding elements in X**, and the norms of 
corresponding elements in X and X** are the same. 

This proposition enables us to identify $L_x$ with x, i.e. to embed X 
in X**, which is written as $X \subset X^{**}$. In other words, X is isometric 
with part of X**. 




We shall see later that X is isometric with the whole of X** in 
certain cases, i.e. X** = X. Such a space X is described as regular. 
The notation (l, x) or (x, l) is often used instead of $l(x) = L_x(l)$: 

$$(x, l) = (l, x) = l(x), \tag{61}$$

and (x, l) is called the inner product of the element $l \in X^*$ and the 
element $x \in X$. The elements l and x are said to be orthogonal if 
(x, l) = 0. 

We can write (59) as 

$$|(x, l)| \le \|x\|\,\|l\| \tag{62}$$

and it follows from what has been said that 

$$(ax, bl) = ab\,(x, l); \quad (x_1 + x_2, l) = (x_1, l) + (x_2, l); \quad (x, l_1 + l_2) = (x, l_1) + (x, l_2),$$

where a and b are any real numbers. 

We have so far considered real B spaces. Everything said above 
still holds for complex B spaces (multiplication of an element x by 
arbitrary complex numbers). In this case functionals may take any 
complex values. In future, we shall write $\bar\alpha$ as previously for the 
complex conjugate of $\alpha$ ($\alpha = a + bi$; $\bar\alpha = a - bi$). Notice that, if $l(x)$ 
is a linear functional in X, $\overline{l(x)}$ is also bounded in X, but is not a linear 
functional, since $\overline{l(x)}$ is multiplied by $\bar c$ when x is multiplied by the 
complex number c. The conjugate space X* is defined as the set of all 
$l(x)$, not of all $\overline{l(x)}$. 

Addition of elements of X* and multiplication by a complex number 
are ordinary addition and multiplication by complex numbers. The 
norm $\|l\|$ is defined as before, so that $|l(x)| \le \|l\| \cdot \|x\|$, and it is not 
possible to replace $\|l\|$ by a smaller number. X* is a B space. The 
inner product is defined by the formulae 

$$(l, x) = l(x); \quad (x, l) = \overline{(l, x)} = \overline{l(x)}. \tag{63}$$

Here, for any complex a and b: 

$$(al, bx) = ab\,(l, x). \tag{64}$$

The space X** = (X*)* has the same connection with X as in the 
case of a real space. 


100. Weak convergence of functionals. We considered in [99] the 
convergence of a sequence of linear functionals $l_n(x)$ to the functional 
$l(x)$: $\|l - l_n\| \to 0$ and all the more, $l_n(x) \to l(x)$ for any $x \in X$. This 
is usually described as a convergence in norm. It follows from [99] 
that the norms of the $l_n(x)$ are bounded, i.e. do not exceed some 
positive number for any n. Let us now introduce a new concept of 
convergence. We say that a sequence of linear functionals $l_n(x)$ is weakly 
convergent if, given any $x \in X$, the sequence $l_n(x)$ has a (finite) limit. 
Let us write $l(x)$ for this limit. This is a functional defined in the whole 
of X. It is distributive because $l_n(x)$ are distributive, and if we knew 
that the sequence $\|l_n\|$ is bounded, we could say that $l(x)$ is bounded 
(and therefore linear). This is in fact the case. 

THEOREM 1. Let L be a set of linear functionals $l(x)$, where there exists 
for any element x a positive number $m_x$ such that $|l(x)| \le m_x$ if $l \in L$, i.e. 
given a fixed x, the set of numbers $|l(x)|$ is bounded. Then the norms of the 
functionals $l(x)$ ($l \in L$) are bounded. 

We show first that the theorem can be proved simply by showing 
that $|l(x)|$ is bounded in some sphere. In fact, let there exist a positive 
number b such that 

$$|l(x)| \le b \quad (l \in L), \tag{65}$$

if x belongs to some closed sphere $S(x_0, a)$ ($\|x - x_0\| \le a$), and let y 
be any element of X differing from zero. The element $x = ay/\|y\| + x_0$ 
belongs to $S(x_0, a)$, and by (65), we have 

$$\left| \frac{a}{\|y\|}\,l(y) + l(x_0) \right| \le b$$

and all the more 

$$\frac{a}{\|y\|}\,|l(y)| - |l(x_0)| \le b,$$

whence, since $|l(x_0)| \le b$ by (65), 

$$|l(y)| \le \frac{b + |l(x_0)|}{a}\,\|y\| \le \frac{2b}{a}\,\|y\| \tag{66}$$

for any $y \in X$, i.e. $\|l\| \le 2b/a$, which in fact amounts to the assertion 
of the theorem. Hence, if the set of numbers $|l(x)|$ is bounded in some 
sphere, the theorem is proved. Let us now prove the theorem by 
reductio ad absurdum. Suppose that the set in question is unbounded 
in any closed sphere; we show that this leads to a contradiction. 

We fix a closed sphere $S_0$. By the assumption just made, there exists an 
element $x_1 \in S_0$ and a functional $l_1 \in L$ such that $|l_1(x_1)| > 1$. Since 
$l_1(x)$ is continuous, we can assume that $x_1$ lies inside $S_0$ and that the 
inequality $|l_1(x)| > 1$ is satisfied throughout $S(x_1, r_1)$, where $r_1$ is a 
sufficiently small positive number. As above, there exist an element 
$x_2$ lying inside $S(x_1, r_1)$, a functional $l_2 \in L$ and a small positive $r_2$ 
such that the sphere $S(x_2, r_2)$ belongs to $S(x_1, r_1)$ and $|l_2(x)| > 2$ in 
the whole of this sphere. On proceeding in this way, we get a sequence 
of embedded spheres 

$$S(x_1, r_1) \supset S(x_2, r_2) \supset S(x_3, r_3) \supset \ldots$$

and of functionals $l_k \in L$ such that $|l_k(x)| > k$ in the whole of $S(x_k, r_k)$. 
It can naturally be assumed here that $r_k \to 0$ as $k \to \infty$. Thus 
$|l_k(x_0)| > k$ at the point $x_0$ belonging to all these spheres [85], which 
contradicts the fact that the set of numbers $|l_k(x_0)|$ must be bounded. The 
theorem is proved. 

If a sequence of functionals $l_n(x)$ has a finite limit for any x, the 
number sequence $|l_n(x)|$ is bounded for any x, and by the theorem, 
the sequence $\|l_n\|$ is bounded. As we remarked above, it follows 
from this that the limit $l(x)$ of a weakly convergent sequence of linear 
functionals is also a linear functional. 

If the sequence $\|l_n\|$ is bounded, the existence of a limit of $l_n(x)$ 
only on a lineal dense in X, and not on the whole of X, proves to be 
sufficient for the weak convergence of the functionals. 

THEOREM 2. The necessary and sufficient condition for a sequence of 
functionals $l_n(x)$ to be weakly convergent is that the sequence $\|l_n\|$ be 
bounded and that there exist a limit of $l_n(x)$ on a lineal dense in X. 

The necessity of the first condition follows from Theorem 1, whilst 
that of the second is obvious. We turn to the proof of sufficiency. Let U 
denote the lineal mentioned in the second condition. By the first 
condition, $\|l_n\| \le C$, where C is a positive number. On writing $l(x)$ 
for the limit of $l_n(x)$ for $x \in U$, we can say that $l(x)$ is a distributive 
functional bounded on U (its norm does not exceed C). We can extend 
it in continuity on to the whole of X. We again use $l(x)$ to denote the 
linear functional thus obtained, and show that $l_n(x_0) \to l(x_0)$ for any 
$x_0 \in X$. If $x_0 \in U$, this is true. Let $x_0 \notin U$. Given any $\varepsilon > 0$, there 
exists an element $x_0' \in U$ such that $\|x_0 - x_0'\| \le \varepsilon/4C$. On recalling 
that the norms of $l_n(x)$ and $l(x)$ do not exceed C, we can write 

$$|l(x_0) - l_n(x_0)| \le |l(x_0) - l(x_0')| + |l(x_0') - l_n(x_0')| + |l_n(x_0') - l_n(x_0)|$$
$$\le \|l\|\,\|x_0 - x_0'\| + |l(x_0') - l_n(x_0')| + \|l_n\|\,\|x_0 - x_0'\| \le \frac{\varepsilon}{2} + |l(x_0') - l_n(x_0')|.$$

But $l_n(x_0') \to l(x_0')$, so that there exists a subscript N such that 
$|l(x_0') - l_n(x_0')| \le \varepsilon/2$ for n > N, and from the previous inequality: 
$|l(x_0) - l_n(x_0)| \le \varepsilon$ for n > N, whence it follows that $l_n(x_0) \to l(x_0)$. 




Note. The second condition of the theorem can be replaced by 
the following: there exists a limit of $l_n(x)$ on a set of elements V, the 
linear envelope of which U (a lineal) is dense in X. For, since $l_n(x)$ is 
distributive, it follows from the convergence of $l_n(x)$ on V that $l_n(x)$ 
is convergent on U. 

The concept of the weak convergence of functionals leads naturally 
to the concept of weak compactness. A set W of elements of X* is said 
to be weakly compact if we can extract a weakly convergent 
subsequence from any sequence of functionals $l_n \in W$. 

THEOREM 3. If X is separable, any bounded set of functionals 
($\|l\| \le r$ ($r > 0$)) is weakly compact. 

We have to show that, if the norms are bounded for a sequence of 
linear functionals $l_n(x)$: $\|l_n\| \le r$, we can choose a subsequence $l_{n_k}(x)$, 
convergent for every $x \in X$. Let $x_1, x_2, \ldots$ be a denumerable set V of 
elements of X, dense in X. Given any m, we have $|l_n(x_m)| \le r\,\|x_m\|$, 
i.e. the sequence of numbers $l_n(x_m)$ ($n = 1, 2, 3, \ldots$) is bounded. On 
applying the usual diagonal process [IV; 15], we form a subsequence 
$l_{n_k}(x)$, convergent on all the elements of V. It follows from the note 
on the previous theorem that the sequence is convergent on the whole 
of X, and the theorem is proved. 

Note. Every bounded set of elements of X* is obviously also 
weakly compact, since it can be included in some sphere $\|l\| \le r$. 


101. The weak convergence of elements. We now introduce the 
concept of the weak convergence of elements of the B space X. We 
say that a sequence $x_n$ of elements of X is weakly convergent to an 
element $x_0$, and write $x_n \xrightarrow{w} x_0$, if $l(x_n) \to l(x_0)$ for any linear functional 
$l(x)$. The element $x_0$ is called the weak limit of $x_n$. Let us show that a 
sequence $\{x_n\}$ cannot be weakly convergent to more than one limit. 
In fact, if $x_n \xrightarrow{w} x_0$ and $x_n \xrightarrow{w} y_0$, we have by definition: $l(x_n) \to l(x_0)$ 
and $l(x_n) \to l(y_0)$ for any $l \in X^*$, whence $l(y_0) = l(x_0)$ or $l(y_0 - x_0) = 0$ 
for any $l \in X^*$. But if $y_0 \ne x_0$, i.e. $y_0 - x_0$ is not the zero element, 
there exists an element $l \in X^*$ such that $l(y_0 - x_0) = \|y_0 - x_0\| > 0$, 
which contradicts what we said above, and our assertion is thus proved. 
If $x_n \xrightarrow{w} x_0$, it is obvious that every subsequence $x_{n_k} \xrightarrow{w} x_0$. If $x_n \xrightarrow{w} x_0$, 
$y_n \xrightarrow{w} y_0$ and $a_n \to a_0$, then $a_n x_n \xrightarrow{w} a_0 x_0$ and $x_n + y_n \xrightarrow{w} x_0 + y_0$. This 
follows at once from the distributive property of the functionals. 

Convergence in norm, $\|x_0 - x_n\| \to 0$, which we wrote above as 
$x_n \Rightarrow x_0$, is sometimes called strong convergence. We have simply 
called it convergence. In view of the continuity of a linear functional, 
it follows from $x_n \Rightarrow x_0$ that $l(x_n) \to l(x_0)$ for any $l \in X^*$, i.e. weak 
convergence follows from strong convergence. 

Strong does not in general follow from weak convergence. Let us 
give an example. We take the space $L_2$ on the segment [0, 1]. As the 
sequence $x_n$ we take the functions $\sin n\pi t$ ($n = 1, 2, \ldots$). 

As will be shown below [102], the general form of a functional in 
$L_2[0, 1]$ is given by 

$$l(x) = \int_0^1 f(t)\,x(t)\,dt,$$

where $f(t)$ is a fixed function of $L_2[0, 1]$ and $x(t)$ is any element of this 
space. In particular, we have for the elements $x_n(t) = \sin n\pi t$: 

$$l(x_n) = \int_0^1 f(t) \sin n\pi t\,dt,$$

whence it is clear that, discounting the factor $\sqrt{2}$, $l(x_n)$ are the Fourier 
coefficients of $f(t)$ with respect to the system $\sin n\pi t$ on [0, 1]. We know 
that $l(x_n) \to 0$ with any choice of $f(t)$ of $L_2[0, 1]$, i.e. $l(x_n) \to l(\theta)$, 
where $\theta$ is the zero element of $L_2[0, 1]$ (the function equivalent to 
zero). Thus $\sin n\pi t$ is weakly convergent to $\theta$ as $n \to \infty$ in $L_2[0, 1]$. At the 
same time there is no strong convergence, since 

$$\|\theta - x_n\|^2 = \int_0^1 \sin^2 n\pi t\,dt = \frac{1}{2}.$$
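
Both halves of this example can be observed numerically. The sketch below is an illustration under our own assumptions: the functional is generated by the particular function $f(t) = t$, for which integration by parts gives $l(x_n) = (-1)^{n+1}/(n\pi)$, and the integrals are approximated by a composite midpoint rule with a hypothetical number of panels.

```python
import math

def integrate(g, a=0.0, b=1.0, panels=20000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / panels
    return h * sum(g(a + (i + 0.5) * h) for i in range(panels))

def l_of_x_n(f, n):
    """Value l(x_n): the integral of f(t) sin(n pi t) over [0, 1]."""
    return integrate(lambda t: f(t) * math.sin(n * math.pi * t))

def norm_sq_x_n(n):
    """Squared L2 norm of x_n(t) = sin(n pi t) on [0, 1]."""
    return integrate(lambda t: math.sin(n * math.pi * t) ** 2)

f = lambda t: t    # hypothetical choice of the generating function
for n in (1, 10, 50):
    exact = (-1) ** (n + 1) / (n * math.pi)
    print(n, l_of_x_n(f, n), exact, norm_sq_x_n(n))
```

The values $l(x_n)$ tend to zero, whereas the squared norms remain equal to 1/2, so the sequence approaches the zero element weakly but not in norm.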


THEOREM 1. If $x_n \xrightarrow{w} x_0$, the sequence $\|x_n\|$ is bounded. 

We can regard $x_n$ and $x_0$ as elements of X**. It now follows from 
$x_n \xrightarrow{w} x_0$ that the functionals in X* corresponding to $x_n$ are weakly 
convergent to the functional corresponding to $x_0$. But, by Theorem 1 
of [100], the norms of these functionals, equal to $\|x_n\|$, form a 
bounded set, and the theorem is proved. 

The weak compactness of a set of elements of X is defined like the 
weak compactness of a set of functionals. Every bounded set of elements 
x ($\|x\| \le C$) is a bounded set in X**, but this latter set is weakly 
compact, as a set of functionals in X*, when X* is separable. If X is a 
regular space, i.e. X** = X, it follows from what has been said that the 
following theorem holds. 

THEOREM 2. If X is regular, whilst X and X* are separable, every 
bounded set of elements of X is weakly compact. 

Notice that, if X is not a regular space, i.e. X** is wider than X, 
the limit of a sequence of elements of X** can be an element of X** 
for which there is no corresponding element of X. It can be shown that, 
if X is separable and regular, X* is also separable. 

An immediate consequence of Theorem 2 of [100] and the above 
correspondence of elements of X** to elements $x \in X$ is: 

THEOREM 3. The necessary and sufficient condition for a sequence $x_n$ 
of elements of a regular space X to be weakly convergent is that the 
sequence $\|x_n\|$ be bounded and that there exist a limit of $l(x_n)$ on some lineal 
U of elements $l \in X^*$, dense in X*. 

As in [100], the lineal U can be replaced by a set V of elements of X* 
whose linear envelope is dense in X*. 

THEOREM 4. Let A be a linear operator in the B space X and R(A) 
belong to the (also B) space X’. If $x_n \xrightarrow{w} x_0$ in X, then $Ax_n \xrightarrow{w} Ax_0$ in X’. 

Let $m(y)$ be any linear functional in X’. It is easily seen that $m(Ax)$ 
is a linear functional in X, and $x_n \xrightarrow{w} x_0$ in X implies $m(Ax_n) \to m(Ax_0)$. 
This holds for any functional in X’, so that $Ax_n \xrightarrow{w} Ax_0$ in X’, and the 
theorem is proved. We know that a linear operator is continuous in 
the sense of strong convergence. Theorem 4 asserts that it is continuous 
also in the sense of weak convergence. 

We have seen that, if $x_n \Rightarrow x_0$, then $\|x_n\| \to \|x_0\|$ [95]. This 
property may not hold for weak convergence. To return to the above 
example of the sequence of functions $\sin n\pi x$ of $L_2$ on [0, 1], we have 
seen that $\sin n\pi x \xrightarrow{w} \theta$, whilst $\|\sin n\pi x\| = 1/\sqrt{2}$. 

THEOREM 5. If $x_n \xrightarrow{w} x_0$, then $\|x_0\| \le \liminf \|x_n\|$.

Notice first of all that $\liminf \|x_n\|$ is finite by Theorem 1. We use
reductio ad absurdum. Let $\|x_0\| > \liminf \|x_n\|$. We take a number $m$
satisfying the inequality

$$\|x_0\| > m > \liminf \|x_n\|. \qquad (67)$$

It follows from this that $\|x_n\| < m$ for infinitely many $n$.
Further, there is a linear functional $l(x)$ such that $l(x_0) = \|x_0\|$ and
$\|l\| = 1$ [98], and we have $|l(x_n)| \le \|l\| \cdot \|x_n\|$, i.e.
$|l(x_n)| \le \|x_n\| < m$ for these $n$, whilst, by (67), $l(x_0) = \|x_0\| > m$.

Thus $l(x_n)$ does not tend to $l(x_0)$, which contradicts the hypothesis,
and the theorem is proved.

Suppose that space $X$ satisfies the following condition: given any
$\delta > 0$, there exists a number $\eta < 1$ such that, if $\|x\| = \|y\| = 1$
and $\|x - y\| \ge \delta$, then $\left\|\tfrac{1}{2}(x + y)\right\| < \eta$. We say here that $X$ is
a uniformly convex space. The following assertion holds for such
spaces: if $x_n \xrightarrow{w} x_0$ and $\|x_n\| \to \|x_0\|$, then $x_n \Rightarrow x_0$. We shall prove
this later for the particular case of Hilbert space. The property of
uniform convexity holds for spaces $L_p$ with $p > 1$ (see, e.g., S.L.
Sobolev, Some Applications of Functional Analysis to Mathematical
Physics (Nekotorye primeneniya funktsional'nogo analiza v
matematicheskoi fizike)).
THEOREM 6. If $x_n \xrightarrow{w} x_0$, then $x_0$ belongs to the closure (in norm) of
the linear envelope of the set of elements $x_n$ ($n = 1, 2, \ldots$).

We use reductio ad absurdum. Let $U$ be the linear envelope of elements
$x_n$, and let $x_0$ not belong to $\bar{U}$, i.e.

$$\inf_{y \in U} \|x_0 - y\| = d > 0. \qquad (68)$$

We have for any non-zero number $t$:

$$\|t x_0 + y\| \ge |t|\, d. \qquad (69)$$

For,

$$\|t x_0 + y\| = |t| \left\| x_0 - \left(-\frac{1}{t}\right) y \right\|.$$

But if $y \in U$, then $(-1/t) \cdot y \in U$, and (69) follows at once from (68).
We now take a set of elements of the form

$$x = t x_0 + y, \qquad (70)$$

where $y \in U$ and $t$ is any number. As in [98], it is easily seen that the
expression of $x$ in form (70) is unique, and that the set of elements $x$
in question is a lineal. We write it as $V$ and define a distributive
functional on it by the formula $l(x) = t$, so that $l(y) = 0$ if $y \in U$. We
show that this functional is bounded on $V$. Let $t \ne 0$. By (69), we can
write $|l(t x_0 + y)| = |t| \le (1/d)\, \|t x_0 + y\|$. This inequality is
obvious when $t = 0$.

Thus $\|l\| \le 1/d$ on $V$. We can extend $l$ on to the whole of $X$ with
the same bound for the norm and obtain a linear functional $l(x)$. By
definition, $l(x_0) = 1$ and $l(x_n) = 0$, since all $x_n \in U$. We see that $l(x_n)$
does not tend to $l(x_0)$, which contradicts the hypothesis: $x_n \xrightarrow{w} x_0$.
The theorem is proved.

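The last theorem can also be stated as follows: if $x_n \xrightarrow{w} x_0$, there
exists a sequence of linear combinations of elements $x_n$:
$c_1^{(k)} x_1 + c_2^{(k)} x_2 + \ldots + c_{n_k}^{(k)} x_{n_k}$ ($k = 1, 2, \ldots$), which is strongly convergent to $x_0$:

$$c_1^{(k)} x_1 + c_2^{(k)} x_2 + \ldots + c_{n_k}^{(k)} x_{n_k} \Rightarrow x_0 \quad \text{as } k \to \infty.$$

The stronger assertion can be proved: if $x_n \xrightarrow{w} x_0$, there exists a
subsequence $x_{n_k}$ ($k = 1, 2, \ldots$) such that

$$\frac{1}{k}\left(x_{n_1} + x_{n_2} + \ldots + x_{n_k}\right) \Rightarrow x_0 \quad \text{as } k \to \infty.$$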
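
A standard illustration of this statement (not taken from the text) is the sequence of base vectors $e_n$ of $l_2$, which is weakly convergent to zero although $\|e_n\| = 1$. The sketch below, assuming finite truncations of the sequences, computes the norms of the arithmetic means $(1/k)(e_1 + \ldots + e_k)$; these equal $1/\sqrt{k}$ and tend to zero. The function names are hypothetical.

```python
import math

def base_vector(n, length):
    """Truncation to `length` coordinates of the base vector e_n (1-indexed)."""
    return [1.0 if i == n else 0.0 for i in range(1, length + 1)]

def mean_of_first(k):
    """Arithmetic mean (e_1 + ... + e_k) / k, stored with k coordinates."""
    total = [0.0] * k
    for n in range(1, k + 1):
        e = base_vector(n, k)
        total = [s + c for s, c in zip(total, e)]
    return [s / k for s in total]

def l2_norm(x):
    """Euclidean (l_2) norm of a finite list of coordinates."""
    return math.sqrt(sum(c * c for c in x))

for k in (1, 4, 100):
    # Each e_n has norm 1, but the means have norm 1/sqrt(k).
    print(k, l2_norm(mean_of_first(k)))
```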


THEOREM 7. If space $X$ is regular and the sequence $x_n \in X$ is weakly
mutually convergent, i.e. $l(x_n) - l(x_m) = l(x_n - x_m) \to 0$ as $n$ and
$m \to \infty$ for any element $l \in X^*$, the sequence $x_n$ is weakly convergent.

It follows from the hypothesis and Cauchy's test for numerical
sequences that $l(x_n)$ has a limit for any $l \in X^*$, i.e. the linear functionals
$L_{x_n}(l) = l(x_n)$ in $X^*$ have a limit for any $l \in X^*$. This limit is also a
linear functional in $X^*$. But $X$ is regular, so that this limiting
functional has the form $L_{x_0}(l) = l(x_0)$, i.e. for any $l \in X^*$ we have
$l(x_n) \to l(x_0)$, i.e. $x_n \xrightarrow{w} x_0$, which is what we had to prove.

Theorem 7 can be stated alternatively as: a regular space $X$ has weak
completeness.


102. Linear functionals in $C$, $L_p$ and $l_p$. 1. We know that the
general form of linear functional in $C$ on the finite interval $[a, b]$ is [15]:

$$l(f) = \int_a^b f(x)\, dg(x), \qquad (71)$$

where $g(x)$ is a function of bounded variation, continuous from the
right and satisfying the condition $g(a) = 0$, where distinct functions
$g(x)$ with the indicated properties generate distinct functionals $l(f)$.
We also know that $\|l\| = V_a^b\, g(x)$. We can therefore associate each
$l(f)$ in $C$ with a function $g(x)$ with the indicated properties and with
the norm $\|g\| = V_a^b\, g(x)$ and identify space $C^*$ with the space $V$ of
these functions. Space $V$ is of type $B$ [96].

We now consider functionals in $C^*$, i.e. in $V$. Such a functional is
given by (71) for any fixed function $f(x)$ continuous in $[a, b]$.

Thus space $C$ is embedded in $C^{**}$. Let us show that not every
functional in $V$ is expressible by (71). We take as the functional $l_0(g)$
in $V$ the sum of the jumps of $g(x)$ and show that it is not expressible
by (71) whatever the choice of continuous function $f(x)$. We form the
following element of $V$:

$$g_c(x) = \begin{cases} 0 & \text{for } a \le x < c, \\ 1 & \text{for } c \le x \le b. \end{cases}$$

If we were to have (71), we should get $l_0(g_c) = f(c)$. But the sum of
the jumps of $g_c(x)$ is unity, so that $f(c) = 1$, i.e. the continuous function
$f(x) \equiv 1$. But now (71) gives $l_0(g) = g(b) - g(a)$, and this difference
is not the sum of the jumps for every function $g(x) \in V$. Hence $C^{**}$
is wider than $C$, i.e. $C$ is not a regular space. Everything that has been
said refers to real functions, but can also be extended to complex
functions.

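2. We now establish the general form of linear functional in space
$L_p$ ($p > 1$) of real functions on the bounded measurable set $\mathcal{E}_0$ of $R_n$.
Let $l(f)$ be such a functional and $\omega_{\mathcal{E}}(x)$ the characteristic function of
any measurable set $\mathcal{E}$ belonging to $\mathcal{E}_0$. Obviously, $\omega_{\mathcal{E}}(x) \in L_p(\mathcal{E}_0)$,
and we can write

$$l(\omega_{\mathcal{E}}) = F(\mathcal{E}). \qquad (72)$$

Let us show that this set function, defined for all measurable $\mathcal{E}$ of
$\mathcal{E}_0$, is completely additive. Let $\mathcal{E} = \mathcal{E}_1 + \mathcal{E}_2 + \ldots$, where the
measurable sets $\mathcal{E}_k$ are pairwise disjoint. The series

$$\sum_{k=1}^{\infty} \omega_{\mathcal{E}_k}(x) \qquad (73)$$

is convergent in $L_p(\mathcal{E}_0)$. For,

$$\left\| \sum_{k=q}^{r} \omega_{\mathcal{E}_k} \right\|_{L_p(\mathcal{E}_0)} = \left[ \int_{\mathcal{E}_0} \Big| \sum_{k=q}^{r} \omega_{\mathcal{E}_k}(x) \Big|^p dx \right]^{1/p} = \left[ \sum_{k=q}^{r} m(\mathcal{E}_k) \right]^{1/p},$$

and the last sum tends to zero as $r$ and $q \to \infty$, since the measure is
completely additive. Series (73) is convergent to $\omega_{\mathcal{E}}(x)$ at each point
$x$ of $\mathcal{E}_0$.

Consequently it is also convergent to $\omega_{\mathcal{E}}(x)$ in $L_p(\mathcal{E}_0)$ [62] (or to
an equivalent function), i.e.

$$\omega_{\mathcal{E}}(x) = \sum_{k=1}^{\infty} \omega_{\mathcal{E}_k}(x), \qquad (74)$$

where the convergence can be understood as a convergence in $L_p(\mathcal{E}_0)$.
In view of the continuity of the functional $l(f)$ in $L_p(\mathcal{E}_0)$, (74) gives us
$F(\mathcal{E}) = F(\mathcal{E}_1) + F(\mathcal{E}_2) + \ldots$, i.e. $F(\mathcal{E})$ is completely additive. If $\mathcal{E}'$
is a set of measure zero, then

$$|l(\omega_{\mathcal{E}'})| \le \|l\| \cdot \|\omega_{\mathcal{E}'}\| = \|l\| \left[ \int_{\mathcal{E}'} dx \right]^{1/p} = 0,$$

i.e. $l[\omega_{\mathcal{E}'}(x)] = 0$ if $m(\mathcal{E}') = 0$, and the theorem of [73] gives the form
for $F(\mathcal{E})$:

$$F(\mathcal{E}) = \int_{\mathcal{E}} \psi(x)\, dx,$$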


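where we can say for the moment, as regards $\psi(x)$, that it is summable
on $\mathcal{E}_0$. We have thus shown that

$$l[\omega_{\mathcal{E}}(x)] = \int_{\mathcal{E}_0} \psi(x)\, \omega_{\mathcal{E}}(x)\, dx.$$

Since $l(f)$ is distributive, we have

$$l(\varphi) = \int_{\mathcal{E}_0} \psi(x)\, \varphi(x)\, dx \qquad (75)$$

for any bounded function $\varphi(x)$ with a finite number of values. We shall
now show that the formula holds for any bounded $\varphi(x)$ measurable
on $\mathcal{E}_0$ (such a function belongs to $L_p(\mathcal{E}_0)$). We first suppose that
$\varphi(x) \ge 0$. By Theorem 1 of [46], there exists a sequence of functions
$\varphi_n(x)$ with a finite number of finite values such that $\varphi_n(x) \to \varphi(x)$
uniformly on $\mathcal{E}_0$. The $\varphi_n(x)$ are therefore bounded on $\mathcal{E}_0$ by the same
number. We have for the $\varphi_n(x)$:

$$l(\varphi_n) = \int_{\mathcal{E}_0} \psi(x)\, \varphi_n(x)\, dx. \qquad (76)$$

It is obvious that $\varphi_n(x) \Rightarrow \varphi(x)$ in $L_p(\mathcal{E}_0)$, and we can pass to the
limit under the integral sign [54] in the last formula. The continuity
of the functional $l(\varphi)$ leads to (75), which is thus established for any
non-negative bounded $\varphi(x)$ measurable on $\mathcal{E}_0$. The case of a bounded
function of any sign reduces at once to the case discussed by writing
$\varphi(x) = \varphi^+(x) - \varphi^-(x)$, where $\varphi^+(x)$ and $\varphi^-(x)$ are the positive and
negative parts of $\varphi(x)$. We now show that $\psi(x) \in L_{p'}(\mathcal{E}_0)$, where
$(1/p) + (1/p') = 1$. On substituting in (75) the bounded measurable
function $\varphi(x)$, defined as follows:

$$\varphi(x) = \begin{cases} |\psi(x)|^{p'-1} \operatorname{sgn} \psi(x) & \text{for } |\psi(x)| \le N, \\ N^{p'-1} \operatorname{sgn} \psi(x) & \text{for } |\psi(x)| > N, \end{cases} \qquad (77)$$

where

$$\operatorname{sgn} a = \begin{cases} 1 & \text{for } a > 0, \\ -1 & \text{for } a < 0, \\ 0 & \text{for } a = 0, \end{cases}$$

we get

$$l(\varphi) \ge \int_{\mathcal{E}_0} |\varphi(x)|^p\, dx, \qquad (78)$$

since $|\psi(x)| \ge |\varphi(x)|^{1/(p'-1)}$ and $p'/(p' - 1) = p$.
On the other hand,

$$l(\varphi) \le \|l\| \cdot \|\varphi\| = \|l\| \left[ \int_{\mathcal{E}_0} |\varphi(x)|^p\, dx \right]^{1/p},$$

and we have by (78):

$$\int_{\mathcal{E}_0} |\varphi(x)|^p\, dx \le \|l\| \left[ \int_{\mathcal{E}_0} |\varphi(x)|^p\, dx \right]^{1/p},$$

whence

$$\left[ \int_{\mathcal{E}_0} |\varphi(x)|^p\, dx \right]^{1/p'} \le \|l\|.$$

But it follows from (77) that

$$|\varphi(x)|^p = \begin{cases} |\psi(x)|^{p'} & \text{for } |\psi(x)| \le N, \\ N^{p'} & \text{for } |\psi(x)| > N, \end{cases}$$

so that

$$\left[ \int_{\mathcal{E}_0} |\psi_N(x)|^{p'}\, dx \right]^{1/p'} \le \|l\|, \qquad (79)$$

where $\psi_N(x)$ is the cut-off function $\psi(x)$. Hence it follows that
$\psi(x) \in L_{p'}(\mathcal{E}_0)$, and that

$$\|l\| \ge \|\psi\|_{L_{p'}(\mathcal{E}_0)}. \qquad (80)$$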
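Now let $\varphi(x)$ be any function of $L_p(\mathcal{E}_0)$. There exists a sequence of
measurable bounded functions $\varphi_n(x)$, which tends to $\varphi(x)$ in $L_p(\mathcal{E}_0)$.
In view of the continuity of the functional: $l(\varphi_n) \to l(\varphi)$, and by virtue
of [62]:

$$\int_{\mathcal{E}_0} \psi(x)\, \varphi_n(x)\, dx \to \int_{\mathcal{E}_0} \psi(x)\, \varphi(x)\, dx.$$

We have (75) for $\varphi_n(x)$, and it follows from what has been said that
(75) holds for any $\varphi(x) \in L_p(\mathcal{E}_0)$. By Hölder's inequality:

$$|l(\varphi)| \le \left[ \int_{\mathcal{E}_0} |\varphi(x)|^p\, dx \right]^{1/p} \|\psi\|_{L_{p'}(\mathcal{E}_0)},$$

whence

$$\|l\| \le \|\psi\|_{L_{p'}(\mathcal{E}_0)}, \qquad (81)$$

which gives, in conjunction with (80):

$$\|l\| = \|\psi\|_{L_{p'}(\mathcal{E}_0)}. \qquad (82)$$

Thus every linear functional in $L_p(\mathcal{E}_0)$ is expressible by (75), where
$\psi(x) \in L_{p'}(\mathcal{E}_0)$, and (82) holds.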

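Let $\psi(x)$ be any fixed function of $L_{p'}(\mathcal{E}_0)$. In view of Hölder's
inequality, (75) yields a linear functional in $L_p(\mathcal{E}_0)$, the norm of which
satisfies inequality (81), i.e. (75) is the general formula for a linear
functional in $L_p(\mathcal{E}_0)$. Since $p(p' - 1) = p'$, the function
$\varphi(x) = |\psi(x)|^{p'-1} \operatorname{sgn} \psi(x)$ belongs to $L_p(\mathcal{E}_0)$, and we can substitute this
$\varphi(x)$ in (75):

$$l\left[ |\psi(x)|^{p'-1} \operatorname{sgn} \psi(x) \right] = \int_{\mathcal{E}_0} |\psi(x)|^{p'}\, dx. \qquad (83)$$

The norm of $\varphi(x) = |\psi(x)|^{p'-1} \operatorname{sgn} \psi(x)$ in $L_p(\mathcal{E}_0)$ is equal to:

$$\left[ \int_{\mathcal{E}_0} |\varphi(x)|^p\, dx \right]^{1/p} = \left[ \int_{\mathcal{E}_0} |\psi(x)|^{p'}\, dx \right]^{1/p},$$

and we get from (83)

$$\int_{\mathcal{E}_0} |\psi(x)|^{p'}\, dx \le \|l\| \left[ \int_{\mathcal{E}_0} |\psi(x)|^{p'}\, dx \right]^{1/p},$$

whence

$$\|l\| \ge \|\psi\|_{L_{p'}(\mathcal{E}_0)},$$

which, in conjunction with (81), again yields (82). Thus formula (75),
where $\psi(x)$ is any function of $L_{p'}(\mathcal{E}_0)$, gives the general form of a linear
functional in $L_p(\mathcal{E}_0)$, where (82) holds.

Equivalent functions $\psi(x)$ obviously yield the same functionals
(coincident on all $\varphi(x) \in L_p(\mathcal{E}_0)$). Let us show that non-equivalent
$\psi(x)$ yield distinct functionals. This obviously amounts to proving
the following: if $\psi_0(x) \in L_{p'}(\mathcal{E}_0)$ and we have for any $\varphi(x) \in L_p(\mathcal{E}_0)$:

$$\int_{\mathcal{E}_0} \psi_0(x)\, \varphi(x)\, dx = 0,$$

then $\psi_0(x)$ is equivalent to zero. On putting $\varphi(x) = |\psi_0(x)|^{p'-1} \operatorname{sgn} \psi_0(x)$,
we get

$$\int_{\mathcal{E}_0} |\psi_0(x)|^{p'}\, dx = 0,$$

whence it follows at once that $\psi_0(x)$ is equivalent to zero [51].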

Let us now consider space $L_p(\mathcal{E}_\infty)$, where $\mathcal{E}_\infty$ is the entire space $R_n$.
As above, it may be shown that, if $\psi(x) \in L_{p'}(\mathcal{E}_\infty)$, (75) defines a linear
functional in $L_p(\mathcal{E}_\infty)$, and equation (82) holds. Let us show that every
functional in $L_p(\mathcal{E}_\infty)$ is expressible in the form (75), where
$\psi(x) \in L_{p'}(\mathcal{E}_\infty)$. We take the functions $\varphi(x)$ of $L_p(\mathcal{E}_\infty)$ which vanish outside
the interval $\Delta_m$ ($-m \le x_k \le +m$; $k = 1, 2, \ldots, n$). They form the
space $L_p(\Delta_m)$. The functional $l(\varphi)$ for such functions on $L_p(\mathcal{E}_\infty)$ is
also a functional on $L_p(\Delta_m)$, and its general form is

$$l_m(\varphi) = \int_{\Delta_m} \psi_m(x)\, \varphi(x)\, dx,$$

where $\psi_m(x) \in L_{p'}(\Delta_m)$, where $\|\psi_m(x)\|_{L_{p'}(\Delta_m)} \le \|l\|$. It follows at
once from what has been said that $\psi_{m+k}(x)$ and $\psi_m(x)$ are equivalent
on $\Delta_m$ for $k > 0$. We thus obtain the function $\psi(x) \in L_{p'}(\mathcal{E}_\infty)$,
equivalent to $\psi_m(x)$ on $\Delta_m$, and we have

$$l(\varphi) = \int_{\mathcal{E}_\infty} \psi(x)\, \varphi(x)\, dx.$$

Since finite functions are everywhere dense in $L_p(\mathcal{E}_\infty)$, we can conclude
that everything said above for $L_p(\mathcal{E}_0)$ also holds for $L_p(\mathcal{E}_\infty)$. The
results obtained are readily extended to the complex space $L_p(\mathcal{E}_0)$, the
functionals being also capable of taking complex values. It follows at
once from what has been said that the space $L_p^*(\mathcal{E}_0)$ can be identified
with $L_{p'}(\mathcal{E}_0)$, and hence $L_p^{**}(\mathcal{E}_0)$ with $L_{p'}^*(\mathcal{E}_0)$, i.e. $L_p^{**}(\mathcal{E}_0)$ coincides with
$L_p(\mathcal{E}_0)$. In other words, given a fixed $\varphi(x) \in L_p(\mathcal{E}_0)$, the right-hand side
of (75) gives the general form for a linear functional in $L_{p'}(\mathcal{E}_0)$, with
norm equal to $\|\varphi\|_{L_p(\mathcal{E}_0)}$. Thus $L_p(\mathcal{E}_0)$ is a regular space. Since
$L_p(\mathcal{E}_0) = L_{p'}^*(\mathcal{E}_0)$, and $L_{p'}(\mathcal{E}_0)$ is separable, it can be asserted that every sphere
in $L_p(\mathcal{E}_0)$ (or every bounded set) is weakly compact. When $p = 2$ we
have $p' = 2$; i.e. $L_2^*(\mathcal{E}_0)$ is $L_2(\mathcal{E}_0)$. We shall consider this case in detail
in a subsequent chapter. Everything said above also holds for $L_p(\mathcal{E}_\infty)$.
On using the above formula for a linear functional in $L_p(\mathcal{E}_0)$ ($p > 1$),
we can prove the following theorem:

THEOREM. If $\psi(x)$ is measurable on a bounded measurable set $\mathcal{E}_0$ and
$\psi(x)\varphi(x)$ is summable on $\mathcal{E}_0$ for any $\varphi(x) \in L_p(\mathcal{E}_0)$ ($p > 1$), then
$\psi(x) \in L_{p'}(\mathcal{E}_0)$.

It follows at once from the hypotheses that $\psi(x)$ can take an infinite
value only on a set of measure zero, and we can assume that $\psi(x)$ only
takes finite values. We define the function sequence:

$$\psi_n(x) = \begin{cases} \psi(x) & \text{if } |\psi(x)| \le n, \\ n & \text{if } |\psi(x)| > n, \end{cases} \qquad (84)$$

which tends to $\psi(x)$ at every point $x$. If $\varphi(x)$ is any function of $L_p(\mathcal{E}_0)$,
then $|\psi_n(x)\varphi(x)| \le |\psi(x)\varphi(x)|$, where $\psi(x)\varphi(x)$ is summable on $\mathcal{E}_0$
by hypothesis. Hence it follows that

$$\lim_{n \to \infty} \int_{\mathcal{E}_0} \psi_n(x)\, \varphi(x)\, dx = \int_{\mathcal{E}_0} \psi(x)\, \varphi(x)\, dx.$$

But the $\psi_n(x)$ are bounded functions, i.e. belong to $L_{p'}(\mathcal{E}_0)$, and the
integrals on the left-hand side are linear functionals of $\varphi(x)$ in $L_p(\mathcal{E}_0)$.
They have a limit on any element $\varphi(x) \in L_p(\mathcal{E}_0)$, so that their norms
are bounded by some number $A$ [100]:

$$\int_{\mathcal{E}_0} |\psi_n(x)|^{p'}\, dx \le A^{p'},$$

whence we get in the limit [54]:

$$\int_{\mathcal{E}_0} |\psi(x)|^{p'}\, dx \le A^{p'},$$

which is what we wanted to prove.

The case $p = 1$ is singular. It can be shown that the space $L_1^*$ is
isometric with $M$ (the space of measurable bounded functions) and
$L_1$ is not a regular space.

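3. Let us now consider linear functionals in $l_p$ ($p > 1$). Let $l(x)$ be
such a functional. Given any element $x(\xi_1, \xi_2, \ldots)$ of space $l_p$, there is
a corresponding cut-off element $x_n(\xi_1, \xi_2, \ldots, \xi_n, 0, 0, \ldots)$, and
$x_n \Rightarrow x$, since the series with the general term $|\xi_k|^p$ is convergent.
We introduce the elements $y_k(\varepsilon_1^{(k)}, \varepsilon_2^{(k)}, \ldots)$ ($k = 1, 2, \ldots$) such that
$\varepsilon_i^{(k)} = 0$ for $i \ne k$ and $\varepsilon_k^{(k)} = 1$. Let $l(y_k) = a_k$. Since $l(x)$ is distributive,
we have $l(x_n) = a_1 \xi_1 + a_2 \xi_2 + \ldots + a_n \xi_n$, and we obtain, on
making use of the continuity of $l(x)$:

$$l(x) = a_1 \xi_1 + a_2 \xi_2 + \ldots \qquad (85)$$

Let us consider the numbers $a_k$. We introduce the elements
$z_N(\eta_1^{(N)}, \eta_2^{(N)}, \ldots)$ of $l_p$ as follows:

$$\eta_k^{(N)} = \begin{cases} |a_k|^{p'-1} \operatorname{sgn} a_k & \text{for } k \le N, \\ 0 & \text{for } k > N. \end{cases}$$

We have

$$l(z_N) = \sum_{k=1}^{N} |a_k|^{p'}$$

and

$$\sum_{k=1}^{N} |a_k|^{p'} = l(z_N) \le \|l\| \cdot \|z_N\| = \|l\| \left[ \sum_{k=1}^{N} |a_k|^{p'} \right]^{1/p},$$

whence

$$\left[ \sum_{k=1}^{N} |a_k|^{p'} \right]^{1/p'} \le \|l\|,$$

and in the limit as $N \to \infty$:

$$\left[ \sum_{k=1}^{\infty} |a_k|^{p'} \right]^{1/p'} \le \|l\|,$$

i.e. $v(a_1, a_2, \ldots) \in l_{p'}$, and

$$\|v\|_{l_{p'}} \le \|l\|. \qquad (86)$$

Further, Hölder's inequality, applied to the sum (85), shows that
$\|l\| \le \|v\|_{l_{p'}}$, and we obtain, by (86):

$$\|l\| = \|v\|_{l_{p'}}. \qquad (87)$$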


It can be shown, precisely as for $L_p$, that (85), where $v(a_1, a_2, \ldots)$
is any element of $l_{p'}$, gives the general form of a linear functional in $l_p$,
where the element $v$ is uniquely defined by the functional $l(x)$ and (87)
holds. Hence it follows that $l_p^*$ is $l_{p'}$, and that $l_p$ is a regular space.
A theorem holds, precisely analogous to the theorem proved above:
if the series

$$\sum_{k=1}^{\infty} a_k b_k,$$

where $b_k$ are fixed, is convergent for any choice of $(a_1, a_2, \ldots) \in l_p$
($p > 1$), then $(b_1, b_2, \ldots) \in l_{p'}$.
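
The role of the element with components $|a_k|^{p'-1} \operatorname{sgn} a_k$ can be checked on a finite sum. The sketch below assumes a finite vector `a` (a hypothetical choice) in place of an element of $l_{p'}$ and verifies that the functional (85) takes on this element the value $\|a\|_{l_{p'}} \cdot \|z\|_{l_p}$, so that Hölder's inequality becomes an equality there. All function names are illustrative.

```python
def p_norm(x, p):
    """The l_p norm of a finite list of real numbers."""
    return sum(abs(c) ** p for c in x) ** (1.0 / p)

def sgn(c):
    """Sign function: 1, -1 or 0."""
    return (c > 0) - (c < 0)

def extremal(a, p):
    """Element z with components |a_k|^(p'-1) * sgn(a_k), where 1/p + 1/p' = 1."""
    q = p / (p - 1.0)  # conjugate exponent p'
    return [abs(c) ** (q - 1.0) * sgn(c) for c in a]

def functional(a, x):
    """Value of l(x) = a_1*x_1 + a_2*x_2 + ... on a finite vector x."""
    return sum(ak * xk for ak, xk in zip(a, x))

p = 3.0
q = p / (p - 1.0)
a = [2.0, -1.0, 0.5, 0.0, -3.0]  # hypothetical coefficients a_k
z = extremal(a, p)

lhs = functional(a, z)              # equals the sum of |a_k|^p'
rhs = p_norm(a, q) * p_norm(z, p)   # Hoelder bound ||a||_p' * ||z||_p
print(lhs, rhs)
```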


103. Weak convergence in $C$, $L_p$ and $l_p$. 1. The weak convergence
of elements $f_n(x) \in C$ to the element $f(x) \in C$ (in a finite interval
$[a, b]$) is defined by

$$\lim_{n \to \infty} \int_a^b f_n(x)\, dg(x) = \int_a^b f(x)\, dg(x) \qquad (88)$$

for any function $g(x)$ of bounded variation. The necessary and sufficient
conditions for this convergence are: (a) there exists a $C > 0$ such that
$|f_n(x)| \le C$ ($n = 1, 2, \ldots$); (b) $f_n(x) \to f(x)$ for any $x \in [a, b]$.

Condition (a) follows directly from [101]. Further, if $x = x_0$ is any
fixed value of $[a, b]$ and $f(x)$ is any element of $C$, $l_0(f) = f(x_0)$ is
evidently a linear functional in $C$, and, since $f_n(x) \xrightarrow{w} f(x)$, we must have
$l_0(f_n) \to l_0(f)$, i.e. $f_n(x_0) \to f(x_0)$. We now have to show that (88) follows
from (a) and (b) for any choice of function $g(x)$ of bounded variation.
Since $|f_n(x)| \le C$ and $f_n(x) \to f(x)$, passage to the limit is permissible
in (88) if we regard the integrals as Lebesgue-Stieltjes integrals [54].
But, since $f(x)$ and $f_n(x)$ are continuous, these integrals can also be
regarded as ordinary Stieltjes integrals.

Notice further that, by the theorem of [101], when conditions (a)
and (b) are observed, there exists a sequence of linear combinations
of $f_n(x)$ which tends to $f(x)$ uniformly in $[a, b]$. The functions $f_n(x)$
and $f(x)$ are assumed continuous, as indicated above.

Notice that the non-regular space $C$ is not weakly complete. This
corresponds to the fact that the limit of the sequence $f_n(x)$ of continuous
functions convergent at every point of $[a, b]$ and jointly bounded
($|f_n(x)| \le m$) may not in fact be a continuous function.


2. The weak convergence of elements $\varphi_n(x) \in L_p(\mathcal{E}_0)$ ($p > 1$) to an
element $\varphi(x) \in L_p(\mathcal{E}_0)$ is defined by

$$\lim_{n \to \infty} \int_{\mathcal{E}_0} \psi(x)\, \varphi_n(x)\, dx = \int_{\mathcal{E}_0} \psi(x)\, \varphi(x)\, dx \qquad (89)$$

for any function $\psi(x) \in L_{p'}(\mathcal{E}_0)$. By Theorem 3 of [101], the necessary
and sufficient conditions for weak convergence in $L_p(\mathcal{E}_0)$ can be
stated as: (a) the norms of the $\varphi_n(x)$ are bounded, i.e.

$$\left[ \int_{\mathcal{E}_0} |\varphi_n(x)|^p\, dx \right]^{1/p} \le C, \qquad (90)$$

and (b) (89) holds on the set of elements $\psi(x)$ of $L_{p'}(\mathcal{E}_0)$, the linear
envelope of which is everywhere dense in $L_{p'}(\mathcal{E}_0)$. When (90) is
fulfilled, it is sufficient that (89) hold for all the characteristic
functions $\omega_{\mathcal{E}}(x)$ of measurable sets $\mathcal{E}$ contained in $\mathcal{E}_0$ ($\mathcal{E}_0$ is a bounded set).
When $\mathcal{E}_0$ is a one-dimensional finite or infinite interval, it is sufficient
that condition (90) and the equations

$$\lim_{n \to \infty} \int_c^{\xi} \varphi_n(x)\, dx = \int_c^{\xi} \varphi(x)\, dx \qquad (91)$$

be satisfied, where $c$ is any fixed number of the interval and $\xi$ is an
arbitrary point of the interval.
3. The weak convergence of elements $(\xi_1^{(n)}, \xi_2^{(n)}, \ldots) \in l_p$ ($p > 1$)
to an element $(\xi_1, \xi_2, \ldots) \in l_p$ is defined by

$$\lim_{n \to \infty} \left( b_1 \xi_1^{(n)} + b_2 \xi_2^{(n)} + \ldots \right) = b_1 \xi_1 + b_2 \xi_2 + \ldots \qquad (92)$$

for any element $(b_1, b_2, \ldots) \in l_{p'}$. The necessary and sufficient
conditions for weak convergence are

$$\sum_{k=1}^{\infty} |\xi_k^{(n)}|^p \le C, \qquad (93)$$

$$\xi_k^{(n)} \to \xi_k \quad (k = 1, 2, \ldots). \qquad (94)$$

Condition (93) is the usual one, whilst the necessity of (94) follows
from (92) if we take $b_i = 0$ for $i \ne k$ and $b_k = 1$. Let (93) and (94) be
satisfied. It follows from (94) that (92) is satisfied on elements of the
form $(0, 0, \ldots, 0, 1, 0, 0, \ldots)$ (the base vectors of space $l_{p'}$). But the
linear envelope of the base vectors is dense in $l_{p'}$, since the cut-off
elements, all the components of which vanish as from a certain number
(proper to each element), are dense in $l_{p'}$ [59]. Thus the sufficiency of
(93) and (94) follows from what was said in [101].
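
Both conditions are needed, as a simple pair of sequences shows (an illustration not taken from the text). The sketch below, with hypothetical names and with the fixed element $b_k = 1/k$ of $l_{p'}$, compares $x^{(n)} = e_n$, for which (93) and (94) hold, with $x^{(n)} = n e_n$, for which (94) holds but (93) fails: the sums in (92) tend to zero in the first case and remain equal to 1 in the second.

```python
def b(k):
    """Components b_k = 1/k of a fixed element of l_p' (any p' > 1)."""
    return 1.0 / k

def bounded_sequence(n, k):
    """k-th coordinate of x^(n) = e_n: the norms equal 1, so (93) holds."""
    return 1.0 if k == n else 0.0

def unbounded_sequence(n, k):
    """k-th coordinate of x^(n) = n * e_n: the norms equal n, so (93) fails."""
    return float(n) if k == n else 0.0

def pairing(sequence, n, terms=2000):
    """Partial sum of b_k * xi_k^(n) over k = 1, ..., terms."""
    return sum(b(k) * sequence(n, k) for k in range(1, terms + 1))

for n in (1, 10, 1000):
    # Every coordinate of both sequences tends to 0, as required by (94).
    print(n, pairing(bounded_sequence, n), pairing(unbounded_sequence, n))
```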


104. The space of linear operators and the convergence of sequences
of operators. We have discussed above the space of linear functionals
and types of convergence (weak and in the norm) of sequences of
functionals. Let us turn to the same problems for linear operators
in a $B$ space $X$. Let $Y$ be the space of all possible linear operators
in $X$ with a domain of values in some $B$ space $X'$. Addition and
multiplication by a number are defined as for functionals:

$$(A + B)x = Ax + Bx; \qquad (cA)x = c\,(Ax). \qquad (95)$$

The norm of an element $A \in Y$ is the norm $\|A\|$ of the corresponding
operator. As in [99], $Y$ can be shown to be a $B$ space.

We now consider a sequence of linear operators $A_n$ ($n = 1, 2, \ldots$)
from $X$ into $X'$. By what has been said, if $\|A_n - A_m\| \to 0$ as $n$ and
$m \to \infty$, there exists a linear operator $A$ such that $\|A - A_n\| \to 0$, so
that we have for any $x \in X$: $A_n x \Rightarrow Ax$ in $X'$. The convergence
$\|A - A_n\| \to 0$ is called a convergence of operators in the norm.
Here, $\|A_n\|$ ($n = 1, 2, \ldots$) are bounded, as follows from:
$\|A_n\| \le \|A\| + \|A_n - A\|$.

Notice that mutual convergence in norm is necessary and sufficient
for the convergence in norm $\|A - A_n\| \to 0$, i.e. $\|A_n - A_m\| \to 0$
as $n$ and $m \to \infty$ (the space of operators is complete).

Let us write the obvious inequality

$$\|Ax - A_n x\| \le \|A - A_n\| \cdot \|x\|. \qquad (96)$$





If $x$ belongs to some bounded set $U$ of space $X$, there exists a $d$ such
that $\|x\| \le d$ if $x \in U$, and (96) gives $\|Ax - A_n x\| \le \|A - A_n\| \cdot d$.
Hence it follows that, given any $\varepsilon > 0$, there exists a subscript $N$
(depending on $\varepsilon$ and not on $x$) such that $\|Ax - A_n x\| \le \varepsilon$ for $n > N$
and $x \in U$, i.e. the convergence of $A_n x$ to $Ax$ in any bounded set $U$ is
uniform. We therefore sometimes speak of a uniform convergence of
operators, instead of a convergence of operators in norm. Let us
consider another convergence of operators. We say that a sequence of
linear operators $A_n$ is strongly convergent to a linear operator $A$ if
$A_n x \Rightarrow Ax$ in $X'$ for any $x \in X$.

It may be shown as in [99] that, if $A_n x$ is a sequence convergent
in $X'$ for any $x \in X$, the sequence of norms $\|A_n\|$ is bounded, as also
that the following holds: the sufficient condition for the strong
convergence of a sequence $A_n$ is that the $\|A_n\|$ be bounded ($\|A_n\| \le C$)
and that $A_n x$ be convergent on a lineal dense in $X$. Suppose that $A_n x$
is convergent in $X'$ for any $x \in X$. Let $Ax$ denote the limit of
$A_n x$ ($A_n x \Rightarrow Ax$ in $X'$). The operator $A$ is distributive in $X$, because $A_n$
is distributive, and it is bounded because the sequence $\|A_n\|$ is
bounded, i.e. $A$ is a linear operator. Therefore, if $A_n x$ is convergent
in $X'$ for any $x \in X$, the sequence $A_n$ is strongly convergent to a
linear operator $A$. Since $X'$ is complete, it is sufficient to require the
mutual instead of the simple convergence of the sequence $A_n x$. Thus
the space of linear operators is complete not only with respect to
convergence in norm, but also with respect to strong convergence. As
already remarked above, convergence in norm implies strong convergence.
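
The converse is false, as a standard example (not taken from the text) shows: in $l_2$ the operators $P_n$, which keep the first $n$ coordinates of an element and replace the rest by zeros, are strongly convergent to the identity operator $E$, whereas $\|E - P_n\| = 1$ for every $n$. The sketch below assumes sequences truncated to a finite length `M` and uses hypothetical function names.

```python
import math

def project(x, n):
    """P_n x: keep the first n coordinates of x and replace the rest by zeros."""
    return [c if i < n else 0.0 for i, c in enumerate(x)]

def norm(x):
    """l_2 norm of a finite list of coordinates."""
    return math.sqrt(sum(c * c for c in x))

def distance_to_projection(x, n):
    """||x - P_n x|| for a fixed element x."""
    return norm([c - pc for c, pc in zip(x, project(x, n))])

M = 5000                                  # truncation length (an assumption)
x = [1.0 / k for k in range(1, M + 1)]    # fixed element with coordinates 1/k

for n in (10, 100, 1000):
    e_next = [1.0 if i == n else 0.0 for i in range(M)]  # base vector e_(n+1)
    # The first number tends to 0; the second stays equal to 1.
    print(n, distance_to_projection(x, n), distance_to_projection(e_next, n))
```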

There is a third convergence of operators. We say that a sequence
of linear operators $A_n$ is weakly convergent to a linear operator $A$ if
$A_n x \xrightarrow{w} Ax$ in $X'$ for any $x \in X$. Obviously, weak convergence follows
from the strong convergence of operators. Strong and weak
convergence are the same for functionals.

We defined above the addition of linear operators and their
multiplication by a number. If $A$ is a linear operator from $X$ into $X'$ and
$B$ is a linear operator from $X'$ into $X''$, the operator $BA$, defined by

$$(BA)x = B\,(Ax),$$

is a linear operator from $X$ into $X''$. It is distributive because $A$ and $B$
are distributive, and bounded because

$$\|(BA)x\|_{X''} \le \|B\| \cdot \|Ax\|_{X'} \le \|B\| \cdot \|A\| \cdot \|x\|_X.$$

Hence it follows that $\|BA\| \le \|B\| \cdot \|A\|$. We can also form
the product of several factors. If $A$ is a linear operator from $X$ into $X$,
we can take a positive integral power of it: $A^2 x = A(Ax)$ and so on.

Notice that a product may depend on the order of the factors. If
say $A$ and $B$ are linear operators from $X$ into $X$, it is meaningful to
speak of the following linear operators from $X$ into $X$: $(BA)x = B(Ax)$
and $(AB)x = A(Bx)$. These operators may be different. A similar
remark applies to several factors.

A strong convergence of operators is sometimes simply called
convergence. We shall use the notation for it: $A_n \to A$. Let $A_n$ and $B_n$
be sequences of linear operators from $X$ into $X'$ and $a_n$ be a number
sequence. It is easily shown that, if $a_n \to a$, $A_n \to A$ and $B_n \to B$,
then $a_n A_n \to aA$ and $A_n + B_n \to A + B$. A similar assertion holds for
convergence in norm. If $A_n$ are linear operators from $X$ into $X'$ and
$B_n$ from $X'$ into $X''$, the fact that $A_n \to A$ and $B_n \to B$ implies
$B_n A_n \to BA$ (the same for convergence in norm). Let us prove the
last assertion.

We have

$$BAx - B_n A_n x = (B - B_n)(Ax) + B_n (A - A_n)x$$

and

$$\|BAx - B_n A_n x\|_{X''} \le \|(B - B_n)(Ax)\|_{X''} + \|B_n\| \cdot \|(A - A_n)x\|_{X'}.$$

The first term tends to zero, since $B_n \to B$, and the second because
the $\|B_n\|$ are bounded and $A_n \to A$.
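
The inequality $\|BA\| \le \|B\| \cdot \|A\|$ and the dependence of a product on the order of its factors can be seen already for operators on the Euclidean plane. The following is a minimal sketch with two hypothetical $2 \times 2$ matrices; it assumes the Euclidean norm on the plane, for which the norm of an operator is the square root of the largest eigenvalue of $M^T M$.

```python
import math

def matmul(B, A):
    """Product BA of two 2x2 matrices given as nested lists."""
    return [[sum(B[i][k] * A[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def operator_norm(M):
    """Norm of x -> Mx on the Euclidean plane: square root of the largest
    eigenvalue of the symmetric matrix M^T M."""
    a = M[0][0] ** 2 + M[1][0] ** 2              # (M^T M)[0][0]
    b = M[0][0] * M[0][1] + M[1][0] * M[1][1]    # (M^T M)[0][1]
    d = M[0][1] ** 2 + M[1][1] ** 2              # (M^T M)[1][1]
    largest = 0.5 * (a + d) + math.sqrt(0.25 * (a - d) ** 2 + b * b)
    return math.sqrt(largest)

A = [[0.0, 1.0], [0.0, 0.0]]   # hypothetical operators on the plane
B = [[1.0, 0.0], [0.0, 0.0]]

BA, AB = matmul(B, A), matmul(A, B)
print(BA, AB)   # the two products differ
print(operator_norm(BA), operator_norm(B) * operator_norm(A))
```

Here $AB$ is the annihilation operator, whilst $BA$ coincides with $A$.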


105. Conjugate operators. Let $A$ be a linear operator from $X$ into
$X'$ ($X$ and $X'$ are $B$ spaces), and $l'(x')$ be a functional in $X'$ ($l' \in X'^*$).

It is easily seen that $l'(Ax)$ is now a linear functional in $X$:

$$l'(Ax) = l(x).$$

Given $A$, this equation amounts to a correspondence of an element
$l \in X^*$ to every element $l' \in X'^*$. We can write this as $l = A^* l'$,
where the operator $A^*$, defined in the whole of $X'^*$ with a range of
values in $X^*$, is called the conjugate to $A$. Linear functionals and the
operator $A$ are distributive, so that $A^*$ is distributive. We now show
that $A^*$ is bounded, and that $\|A^*\| = \|A\|$, the left-hand side being
the norm of the operator in $X'^*$ and the right-hand side in $X$. We have:

$$|l(x)| = |l'(Ax)| \le \|l'\| \cdot \|Ax\| \le \|l'\| \cdot \|A\| \cdot \|x\|,$$

whence $\|l\| \le \|l'\| \cdot \|A\|$. But $l = A^* l'$, so that $\|A^*\| \le \|A\|$.
Further, let $x_0$ be any fixed element of $X$ and $l'$ an element of $X'^*$
such that $\|l'\| = 1$ and $l'(Ax_0) = \|Ax_0\|$. We get

$$\|Ax_0\| = l'(Ax_0) = l(x_0) \le \|l\| \cdot \|x_0\| = \|A^* l'\| \cdot \|x_0\| \le \|A^*\| \cdot \|l'\| \cdot \|x_0\| = \|A^*\| \cdot \|x_0\|,$$

i.e. $\|Ax_0\| \le \|A^*\| \cdot \|x_0\|$, whence $\|A\| \le \|A^*\|$, which, in
conjunction with $\|A^*\| \le \|A\|$, gives $\|A^*\| = \|A\|$. This leads
us to the following theorem.

THEOREM. The operator $A^*$, conjugate to the linear operator $A$ from $X$
into $X'$, is a linear operator from $X'^*$ into $X^*$ and $\|A^*\| = \|A\|$.

Note. Notice that, if $X'$ is the same as $X$, $A^*$ is a linear operator
in $X^*$ with a range of values also in $X^*$.

If $A$ and $B$ are linear operators from $X$ into $X'$, it follows from the
definition of conjugate operator that $(A + B)^* = A^* + B^*$. If $A$ and
$B$ are linear operators from $X$ into $X$, then $(BA)^* = A^* B^*$, as follows
from $l(BAx) = (B^* l)(Ax) = A^*(B^* l)(x) = (A^* B^* l)(x)$. In the case
of a real space $(cA)^* = cA^*$, whilst for a complex space $(cA)^* = \bar{c} A^*$.
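
In a finite-dimensional real space the conjugate operator is given by the transposed matrix. The sketch below is an illustration under stated assumptions, not a construction from the text: functionals are identified with their coefficient vectors, the spaces carry the norm $\sum |x_k|$, so that the functionals carry the norm $\max |l_k|$, and the matrix and vectors are hypothetical. It checks the defining relation $l'(Ax) = (A^* l')(x)$ and the equality $\|A^*\| = \|A\|$.

```python
def transpose(A):
    """Transposed matrix, representing the conjugate operator A*."""
    return [list(row) for row in zip(*A)]

def apply(A, x):
    """Matrix-vector product Ax."""
    return [sum(a * c for a, c in zip(row, x)) for row in A]

def pair(l, x):
    """Value l(x) of the functional with coefficient vector l."""
    return sum(lk * xk for lk, xk in zip(l, x))

def norm_on_l1(A):
    """Norm of x -> Ax when both spaces carry the norm sum |x_k|:
    the largest column sum of absolute values."""
    return max(sum(abs(row[j]) for row in A) for j in range(len(A[0])))

def norm_on_max(A):
    """Norm of l -> Al when both spaces carry the norm max |l_k|:
    the largest row sum of absolute values."""
    return max(sum(abs(a) for a in row) for row in A)

A = [[1.0, -2.0, 0.0], [3.0, 0.5, -1.0]]   # hypothetical operator from R^3 into R^2
A_star = transpose(A)
x = [1.0, 2.0, -1.0]
l_prime = [0.5, -4.0]

print(pair(l_prime, apply(A, x)), pair(apply(A_star, l_prime), x))
print(norm_on_l1(A), norm_on_max(A_star))
```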


106. Completely continuous operators. A linear operator $A$ from $X$
into $X'$ is said to be completely continuous if it transforms any set
bounded in $X$ to a set compact in $X'$. It is easily seen that a
distributive operator $A$, defined in the whole of $X$ and transforming every
bounded set into a compact set, is bounded, i.e. a linear operator.
For, by hypothesis, $A$ transforms the sphere $\|x\| \le 1$ into the
compact set $Ax$. But every compact set is bounded, i.e. there exists a
positive number $C$ such that $\|Ax\| \le C$ for $\|x\| \le 1$, whence it
follows that $A$ is a bounded operator. The definition of completely
continuous operator can therefore be stated as follows: a distributive
operator $A$, defined in the whole of $X$, is said to be completely
continuous if it transforms every bounded set into a compact set. It follows
from what has been said that the completely continuous operator
thus defined is in fact linear.

THEOREM 1. If $A$ is a completely continuous operator and $x_n \xrightarrow{w} x_0$,
then $Ax_n \Rightarrow Ax_0$.

We know that $Ax_n \xrightarrow{w} Ax_0$, if $A$ is a linear operator.
By hypothesis, $x_n \xrightarrow{w} x_0$, so that the number sequence $\|x_n\|$ is bounded.
In view of the complete continuity of $A$, we can extract from the
sequence $Ax_n$ a subsequence which is strongly convergent to some
element $y_0 \in X'$. On the other hand, it follows from what has been
said that this subsequence is weakly convergent to $Ax_0$, so that
$y_0 = Ax_0$. Thus every strongly convergent subsequence of the
sequence $Ax_n$ is strongly convergent to $Ax_0$. We have to show that the
whole of this sequence is strongly convergent to $Ax_0$.

We use reductio ad absurdum. Suppose that there exists a number
$\delta > 0$ and an infinite subsequence $Ax_{n_k}$ such that $\|Ax_{n_k} - Ax_0\| \ge \delta$.

We can extract from the sequence $Ax_{n_k}$ a strongly convergent
subsequence, and this subsequence must converge strongly to $Ax_0$ as
indicated above, which contradicts the inequality
$\|Ax_{n_k} - Ax_0\| \ge \delta > 0$. The theorem is proved.

THEOREM 2. Let $A_m$ ($m = 1, 2, \ldots$) be a sequence of completely
continuous operators which converge in norm to the linear operator $A$
($\|A - A_m\| \to 0$). The operator $A$ must be completely continuous.

We have to show that $A$ transforms every bounded sequence of
elements $x_n \in X$ ($n = 1, 2, \ldots$) into a compact sequence. Given any
fixed $m$, the sequence $A_m x_n$ is compact. On the other hand, the
convergence in norm implies the uniform convergence on every bounded
set. Thus, given any $\varepsilon > 0$, there exists an $m$ such that
$\|Ax_n - A_m x_n\| \le \varepsilon$ ($n = 1, 2, \ldots$), i.e. $Ax_n$ has a compact $\varepsilon$-net $A_m x_n$, whence it
follows that $Ax_n$ is compact.

It can be shown that, if $A$ is a completely continuous operator, then
$A^*$, defined in $X'^*$ and having a range of values in $X^*$, is also
completely continuous.

We shall prove this later for operators in Hilbert space. The theory
of completely continuous operators will be treated in more detail for
this space.


107. Operator equations. We shall now assume that a linear operator
$A$, defined in space $X$, has a range of values also belonging to $X$
($X'$ coincides with $X$). Now, $A^*$ defined in $X^*$ has a range of values
also in $X^*$. As above, we shall write $E$ for the operator of the identity
transformation in any $B$ space, i.e. $Ex = x$ for any $x \in X$. We also
recall that the annihilation operator is a linear operator transforming
any element $x$ into the zero element. Its norm is equal to zero, whereas
that of any other linear operator is positive.

We take the equation

$$(A - E)x = y, \qquad (97)$$

where $y$ is the given and $x$ the required element of $X$. We rewrite (97) as

$$x = Ax - y. \qquad (98)$$

The right-hand side of this equation is an operator from $X$ into $X$.
We write $Bx = Ax - y$ ($B$ is not a linear operator). We have
$Bx_2 - Bx_1 = Ax_2 - Ax_1$, whence $\|Bx_2 - Bx_1\| \le \|A\| \cdot \|x_2 - x_1\|$.
If $\|A\| < 1$, the principle of compressed mappings is applicable to (98).
We get the following result.

THEOREM. If $\|A\| < 1$, given any $y \in X$, equation (97) has a unique
solution, and this solution can be obtained by the method of successive
approximations from equation (98) with any initial approximation.

We shall often be concerned below with an equation containing a
parameter:

$$(A - \lambda E)x = y. \qquad (99)$$

If $X$ is a real space of type $B$, $\lambda$ is a real number. It may be complex
for complex spaces. Assuming $\lambda \ne 0$, (99) can be written as
$\left(\frac{1}{\lambda} A - E\right) x = \frac{1}{\lambda}\, y$. It follows from the last theorem that, if $|\lambda| > \|A\|$,
(99) has a unique solution for any $y$, and this solution can be obtained
by the method of successive approximations from the equation

$$x = \frac{1}{\lambda} A x - \frac{1}{\lambda}\, y.$$
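
The method of successive approximations for the last equation can be sketched for an operator on the plane. The example below is a minimal illustration with hypothetical data: the matrix `A` has norm less than 1 (its largest row sum of absolute values is 0.3), the zero element is taken as the initial approximation, and `lam = 1` reduces the equation to (98).

```python
def apply(A, x):
    """Matrix-vector product Ax."""
    return [sum(a * c for a, c in zip(row, x)) for row in A]

def successive_approximations(A, y, lam, steps=200):
    """Iterate x <- (1/lam) * A x - (1/lam) * y starting from the zero element."""
    x = [0.0] * len(y)
    for _ in range(steps):
        Ax = apply(A, x)
        x = [(ax - yk) / lam for ax, yk in zip(Ax, y)]
    return x

A = [[0.2, 0.1], [0.0, 0.3]]   # hypothetical operator with norm below 1
y = [1.0, -2.0]
lam = 1.0                      # lam = 1 gives equation (97)

x = successive_approximations(A, y, lam)
# Residual of (A - lam*E) x = y; it should be negligibly small.
residual = [ax - lam * xk - yk for ax, xk, yk in zip(apply(A, x), x, y)]
print(x, residual)
```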


Equation (99) is occasionally written as

$$(\lambda A - E)x = y \quad \text{or} \quad (E - \lambda A)x = y.$$

The condition $|\lambda| > \|A\|$ is here replaced by $|\lambda| < 1/\|A\|$.
Now let $A$ be a completely continuous operator, and let us write
two equations, one in space $X$ and the other in $X^*$:

$$(A - E)x = y \qquad (100)$$
$$(A^* - E)x^* = y^*, \qquad (101)$$

where $y$ and $y^*$ are the given, and $x$, $x^*$ the required elements.
We also write down the corresponding homogeneous equations

$$(A - E)x = \theta \qquad (102)$$
$$(A^* - E)x^* = \theta^*, \qquad (103)$$

where $\theta$ and $\theta^*$ are the zero elements in $X$ and $X^*$ respectively. The
sets of solutions of these equations are lineals. We shall next state
the results regarding the equations written. We shall prove them in
the next chapter for the case of Hilbert space.

If one of equations (100) or (101) has a solution, given any right-hand
side, the other equation has the same property. In this case, given any
right-hand side, each equation has a unique solution, i.e. (102) and
(103) only have the trivial solutions $x = \theta$ and $x^* = \theta^*$.

If one of equations (102) or (103) has a non-zero solution, the other 
has this property, and the number of linearly independent solutions 
is finite and the same for (102) and (103). The lineals of the solutions 
are finite-dimensional subspaces. Here, the necessary and sufficient 
condition for (100) to be soluble is that y be orthogonal to all the 
solutions of (103), whilst the necessary and sufficient condition for 
(101) to be soluble is that y* be orthogonal to all the solutions of (102) 
LIV; 9]. 

These results are naturally preserved for equations with a parameter.
In the case of a complex $B$ space, the equations are written as

$$(A - \lambda E)x = y \quad (104); \qquad (A^* - \bar\lambda E)x^* = y^*; \quad (105)$$
$$(A - \lambda E)x = \theta \quad (106); \qquad (A^* - \bar\lambda E)x^* = \theta^*. \quad (107)$$

A further result must be mentioned. There is only a finite number
of values of $\lambda$, satisfying the condition $|\lambda| > R$, where $R$ is any given
positive number, for which (106), and hence also (107), have non-trivial
solutions (not $x = \theta$ or $x^* = \theta^*$).




The above results are precisely analogous to the theorems that we 
had in the theory of integral equations. 

If $X$ is a real $B$ space, $\lambda$ must be taken as real and $\bar\lambda = \lambda$.

A $\lambda$ for which (106) has non-trivial solutions is called an eigenvalue
of the operator $A$, and the number of linearly independent solutions
of the equation is called the rank of the eigenvalue. The solutions
$x_1, x_2, \ldots, x_m$ form a complete set of linearly independent solutions
of (106) if the general form of any solution of the equation is
$x = c_1x_1 + c_2x_2 + \ldots + c_mx_m$, where the $c_k$ are arbitrary numbers.
The representation of any solution $x$ in this form is unique, in view
of the linear independence of the $x_k$. The linearly independent solutions
can be chosen in different ways, but the number of solutions is always
the same in a complete set. If (106) has non-trivial solutions and the
element $y$ appearing in (104) satisfies the above-mentioned solubility
condition, all the solutions of (104) are expressible by

$$x = x_0 + c_1x_1 + c_2x_2 + \ldots + c_mx_m, \tag{108}$$

where $x_0$ is any given solution of (104), $x_1, x_2, \ldots, x_m$ is a complete
set of linearly independent solutions of (106) and $c_k$ are arbitrary
numbers. All this follows directly from the linearity of (104) and (106)
[IV; 9, 10].


108. Completely continuous operators in $C$, $L_p$ and $l_p$. 1. We consider
the integral operator

$$\varphi(x) = \int_a^b K(x,t)\,\psi(t)\,dt \tag{109}$$

in space $C$, where $[a, b]$ is a finite interval. If the kernel $K(x, t)$ is
continuous in the square $Q$ $[a \le x \le b;\ a \le t \le b]$, (109) obviously
yields a distributive operator from $C$ into $C$. The fact that it is bounded
follows at once from

$$\max_{a \le x \le b}|\varphi(x)| \le \max_{a \le t \le b}|\psi(t)| \cdot \max_{a \le x \le b}\int_a^b |K(x,t)|\,dt. \tag{110}$$


If $U$ is a bounded set of functions $\psi(t)$ in $C$, i.e. $\max_{a \le t \le b}|\psi(t)| \le A$,
it is readily seen that the set of corresponding $\varphi(x)$ is compact. Its
boundedness follows at once from (110), since $\max|\psi(t)| \le A$,
whilst the equicontinuity follows from

$$|\varphi(x_2) - \varphi(x_1)| \le A\int_a^b |K(x_2,t) - K(x_1,t)|\,dt. \tag{111}$$




Thus operator (109) is completely continuous in C when the kernel 
is continuous in Q. 
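Estimate (110) can be checked numerically on a discretized kernel. The sketch below is an illustration only: the kernel $K(x,t) = xt$, the function $\psi(t) = 1 - 2t$, the interval $[0, 1]$, the midpoint rule and the helper names are all assumptions made here, not part of the text.

```python
def max_row_integral(K, a, b, n=400):
    """Approximate the maximum over x of the integral of |K(x, t)| dt (midpoint rule)."""
    h = (b - a) / n
    mids = [a + (j + 0.5) * h for j in range(n)]
    xs = [a + i * h for i in range(n + 1)]
    return max(sum(abs(K(x, t)) for t in mids) * h for x in xs)


def apply_operator(K, psi, a, b, x, n=400):
    """Approximate phi(x) = integral of K(x, t) psi(t) dt (midpoint rule)."""
    h = (b - a) / n
    mids = [a + (j + 0.5) * h for j in range(n)]
    return sum(K(x, t) * psi(t) for t in mids) * h


K = lambda x, t: x * t          # hypothetical continuous kernel
psi = lambda t: 1.0 - 2.0 * t   # hypothetical element of C, max |psi| = 1
bound = max_row_integral(K, 0.0, 1.0)
phi_at_1 = apply_operator(K, psi, 0.0, 1.0, 1.0)
```

For this kernel the factor $\max_x \int_0^1 |K(x,t)|\,dt$ equals $1/2$, and the computed value of $|\varphi(1)|$ stays below $\max|\psi| \cdot 1/2$, as (110) requires.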
It can be shown that the norm of operator (109) is exactly equal to

$$\max_{a \le x \le b}\int_a^b |K(x,t)|\,dt.$$

Operator (109) is completely continuous in $C$ with fewer assumptions
regarding the kernel. Suppose e.g. that $K(x, t)$ is a bounded function
measurable in the square $Q$ and that

$$\lim_{x' \to x} K(x',t) = K(x,t) \tag{112}$$

for any $x$ of $[a, b]$ for almost all $t$. Now [54]:

$$\lim_{x' \to x}\int_a^b |K(x',t) - K(x,t)|\,dt = 0$$

and, given any $\varepsilon > 0$, there exists an $\eta > 0$ such that

$$\int_a^b |K(x_2,t) - K(x_1,t)|\,dt \le \varepsilon \quad\text{for}\quad |x_2 - x_1| \le \eta.$$

This last is proved in the same way as the uniform continuity of a
function continuous on a finite closed interval [I; 43].

The proof that the $\varphi(x)$ are bounded and equicontinuous is the same as
above. We shall later investigate in detail integral operators with
polar kernels.

2. Let us now consider operator (109) in $L_p$ ($p > 1$) on the assumption
that the kernel $K(x, t) \in L_{p'}(Q)$, where $1/p + 1/p' = 1$, i.e.

$$\int_a^b\!\!\int_a^b |K(x,t)|^{p'}\,dx\,dt = A^{p'} < +\infty. \tag{113}$$

If $\psi(x)$ is any function of $L_p[a, b]$, integral (109) has a meaning.
It is easily shown that it defines a measurable function $\varphi(x)$ [cf. 68].
We have by Hölder's formula:

$$|\varphi(x)| \le \left[\int_a^b |K(x,t)|^{p'}\,dt\right]^{1/p'}\left[\int_a^b |\psi(t)|^p\,dt\right]^{1/p}.$$

On raising both sides to the power $p'$ and integrating with respect
to $x$, we get

$$\|\varphi\|^{p'}_{L_{p'}} \le A^{p'}\|\psi\|^{p'}_{L_p}, \quad\text{i.e.}\quad \|\varphi\|_{L_{p'}} \le A\|\psi\|_{L_p}, \tag{114}$$




i.e. (109) is a linear operator from $L_p$ into $L_{p'}$, and by (114) its norm
does not exceed $A$. Let us show that operator (109) is completely continuous.
Let $U$ be a set of functions $\psi(x)$ which is bounded in $L_p$, and
$V$ the set of corresponding $\varphi(x) \in L_{p'}$. We have to show that $V$ is
compact. By hypothesis, $\|\psi\|_{L_p} \le C$ if $\psi(x) \in U$, and it follows from
(114) that $\|\varphi\|_{L_{p'}} \le AC$. It remains to show that the $\varphi(x)$ are
equicontinuous in the mean. On extending $\varphi(x)$ by zero outside $[a, b]$, and
$K(x, t)$ by zero outside $Q$, we have

$$\varphi(x + h) - \varphi(x) = \int_a^b [K(x + h, t) - K(x, t)]\,\psi(t)\,dt,$$

whence, as above,

$$\|\varphi(x + h) - \varphi(x)\|_{L_{p'}} \le \left[\int_a^b\!\!\int_a^b |K(x + h, t) - K(x, t)|^{p'}\,dx\,dt\right]^{1/p'}\|\psi\|_{L_p},$$

i.e.

$$\|\varphi(x + h) - \varphi(x)\|_{L_{p'}} \le \left[\int_a^b\!\!\int_a^b |K(x + h, t) - K(x, t)|^{p'}\,dx\,dt\right]^{1/p'} C. \tag{115}$$

Since $K(x, t)$ is continuous in the mean in $L_{p'}$ on $Q$, given any
$\varepsilon > 0$, there exists an $\eta > 0$ such that

$$\int_a^b\!\!\int_a^b |K(x + h, t) - K(x, t)|^{p'}\,dx\,dt \le \frac{\varepsilon^{p'}}{C^{p'}} \quad\text{for}\quad |h| \le \eta,$$

and it follows from (115) that

$$\|\varphi(x + h) - \varphi(x)\|_{L_{p'}} \le \varepsilon \quad\text{for}\quad |h| \le \eta,$$

where $\eta$ is the same for all $\varphi(x) \in V$, which is what we set out to prove.
The proof is similar for the case of an infinite interval and several
independent variables.

3. We now consider the operator from $l_p$ into $l_{p'}$ ($p > 1$) given by

$$\eta_i = a_{i1}\xi_1 + a_{i2}\xi_2 + \ldots \tag{116}$$

on condition that

$$\sum_{i,k=1}^{\infty} |a_{ik}|^{p'} = A^{p'} < +\infty. \tag{117}$$

On using the notation $x(\xi_1, \xi_2, \ldots)$ and $y(\eta_1, \eta_2, \ldots)$ for the elements
and applying Hölder's inequality for sums, we obtain, precisely as
above,

$$\|y\|_{l_{p'}} \le A\|x\|_{l_p}, \tag{118}$$




so that operator (116) is a linear operator from $l_p$ into $l_{p'}$. Let us show
that it is completely continuous. Let $U$ be a bounded set of elements
$x \in l_p$ ($\|x\| \le C$) and $V$ the corresponding set of elements $y \in l_{p'}$.
We have to show that $V$ is compact. It is bounded by virtue of (118),
and it remains to show that, given any $\varepsilon > 0$, there exists a positive
integer $n_0$ such that

$$\sum_{i=n_0}^{\infty} |\eta_i|^{p'} \le \varepsilon. \tag{119}$$

We have by (116) and Hölder's inequality:

$$\sum_{i=n_0}^{\infty} |\eta_i|^{p'} \le \sum_{i=n_0}^{\infty}\sum_{k=1}^{\infty} |a_{ik}|^{p'}\,\|x\|^{p'}_{l_p} \le \sum_{i=n_0}^{\infty}\sum_{k=1}^{\infty} |a_{ik}|^{p'}\,C^{p'}. \tag{120}$$

By (117), the double series with general term $|a_{ik}|^{p'}$ is convergent,
so that there exists an $n_0$ such that

$$\sum_{i=n_0}^{\infty}\sum_{k=1}^{\infty} |a_{ik}|^{p'} \le \frac{\varepsilon}{C^{p'}}, \tag{121}$$

whence (119) follows, in view of (120).
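Inequality (118) is easy to test on a finite section of a matrix. The sketch below is given only as an illustration: the exponent $p = 3$, the 6 x 6 matrix with entries $1/(ik)^2$ and the vector `x` are hypothetical choices made here, and `lp_norm` is a helper name introduced for the example.

```python
def lp_norm(v, p):
    """The l_p norm of a finite sequence."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)


p = 3.0
p_conj = p / (p - 1.0)  # the conjugate exponent p', here 1.5
n = 6

# Hypothetical finite matrix a_ik = 1 / (i k)^2 with i, k = 1, ..., n.
a = [[1.0 / ((i + 1) * (k + 1)) ** 2 for k in range(n)] for i in range(n)]

# The constant A of condition (117), restricted to the finite matrix.
A = sum(abs(a[i][k]) ** p_conj for i in range(n) for k in range(n)) ** (1.0 / p_conj)

x = [1.0, -2.0, 0.5, 3.0, 0.0, -1.0]
y = [sum(a[i][k] * x[k] for k in range(n)) for i in range(n)]

lhs = lp_norm(y, p_conj)   # norm of y in l_{p'}
rhs = A * lp_norm(x, p)    # A times the norm of x in l_p
```

Hölder's inequality applied to each row gives $|\eta_i| \le \left(\sum_k |a_{ik}|^{p'}\right)^{1/p'}\|x\|_{l_p}$, and summing the $p'$-th powers over $i$ yields `lhs <= rhs`.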


109. Generalized derivatives. We shall now introduce a new type
of derivative, which is often employed in modern mathematical
physics. Let $D$ be a bounded domain of $n$-dimensional Euclidean
space $R_n$, a point $x$ of which is defined by the Cartesian coordinates
$(x_1, x_2, \ldots, x_n)$. We shall always take a domain to mean an open
connected set, and shall assume that the boundaries of any domains
discussed have zero volume measure. As usual, we shall write $\bar{D}$ for
the domain $D$ along with its boundary (the closed domain). We shall
say that $D'$ lies strictly inside $D$ if $D' \subset D$ and the distance from $D'$
to the boundary of $D$ is positive. This is equivalent to the fact that
$\bar{D}' \subset D$. As earlier, we shall describe a function as finite in $D$ if it is
zero outside some domain $D'$ lying strictly inside $D$ ($D'$ may be
different for different functions). Let $\varphi(x)$ have continuous derivatives
up to order $l$ inside $D$ and let $\psi(x)$ be finite. Let us consider a derivative
of order $l$:

$$D^l\varphi(x) = \frac{\partial^l\varphi(x)}{\partial x_1^{l_1}\,\partial x_2^{l_2}\ldots\partial x_n^{l_n}} \quad (l_1 + l_2 + \ldots + l_n = l). \tag{122}$$

Using the formula for integration by parts, and the fact that $\psi(x)$
is finite, we obtain

$$\int_D D^l\varphi(x)\,\psi(x)\,dx = (-1)^l\int_D \varphi(x)\,D^l\psi(x)\,dx. \tag{123}$$
D D 




A more general concept of derivative can be based on (123). 
DEFINITION 1. Let $\varphi(x)$ and $\chi(x)$ be summable over any subdomain $D'$
lying strictly inside $D$, and let

$$\int_D \chi(x)\,\psi(x)\,dx = (-1)^l\int_D \varphi(x)\,D^l\psi(x)\,dx \tag{124}$$

for any finite $l$ times continuously differentiable function $\psi(x)$.

In this case, $\chi(x)$ is called the generalized derivative of the form (122)
of $\varphi(x)$ in $D$.
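A simple illustration may be helpful here; it is added only as a sketch and is not taken from the original text. Take $n = 1$, $D = (-1, 1)$, $l = 1$ and $\varphi(x) = |x|$. For any continuously differentiable $\psi(x)$ finite in $D$, integration by parts on each half of the interval gives

```latex
\int_{-1}^{1} |x|\,\psi'(x)\,dx
  = -\int_{-1}^{0} x\,\psi'(x)\,dx + \int_{0}^{1} x\,\psi'(x)\,dx
  = \int_{-1}^{0} \psi(x)\,dx - \int_{0}^{1} \psi(x)\,dx
  = -\int_{-1}^{1} \operatorname{sign}(x)\,\psi(x)\,dx ,
```

since $\psi(-1) = \psi(1) = 0$. Thus (124) holds with $\chi(x) = \operatorname{sign} x$, i.e. $\operatorname{sign} x$ is the generalized derivative of $|x|$ in $D$, although $|x|$ has no ordinary derivative at $x = 0$.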

Let us show that only one generalized derivative of a given form can
exist for any given $\varphi(x)$. Let $\chi(x)$ and $\chi_1(x)$ be two generalized
derivatives. Equation (124) holds for both $\chi(x)$ and $\chi_1(x)$. Term by term
subtraction gives

$$\int_D [\chi(x) - \chi_1(x)]\,\psi(x)\,dx = 0, \tag{125}$$

whence it follows, since the finite function $\psi(x)$ is arbitrary, that
$\chi(x)$ and $\chi_1(x)$ are equivalent functions in $D$ [71].

If $\varphi(x)$ has continuous derivatives up to order $l$ inside $D$, (123)
holds and $\chi(x) = D^l\varphi(x)$. We shall retain the notation (122) in future
for the generalized derivatives. Let us note some properties of the
generalized derivative that follow directly from the definition. The
generalized derivative $D^l\varphi(x)$ does not depend on the order in which
the differentiations are written, since the order of differentiating $\psi(x)$,
which has continuous derivatives, is arbitrary in (124). If $\varphi_1(x)$ and
$\varphi_2(x)$ have generalized derivatives $\chi_1(x)$ and $\chi_2(x)$ of type (122),
$c_1\varphi_1(x) + c_2\varphi_2(x)$ has the generalized derivative $c_1\chi_1(x) + c_2\chi_2(x)$ of
the same type ($c_1$ and $c_2$ are constants). If $\chi(x)$ is the generalized
derivative of $\varphi(x)$ in $D$, it will be the generalized derivative of the same
type in any domain $D'$ belonging to $D$.

If $\varphi(x)$ has a generalized derivative $\partial\varphi(x)/\partial x_1 = \chi(x)$ and $\chi(x)$ has
a generalized derivative $\partial\chi(x)/\partial x_2$, $\varphi(x)$ has a generalized derivative
$\partial^2\varphi(x)/\partial x_1\partial x_2 = \partial\chi(x)/\partial x_2$. Similarly for the other types of derivative.
Conversely, if $\varphi(x)$ has generalized derivatives $\partial\varphi(x)/\partial x_1$ and $\partial^2\varphi(x)/\partial x_1\partial x_2$,
then $\partial^2\varphi(x)/\partial x_1\partial x_2$ is the generalized derivative of $\partial\varphi(x)/\partial x_1$ with
respect to $x_2$. Below, we shall also prove that, given certain auxiliary
restrictions, the usual formula for differentiation of a product holds:

$$\frac{\partial[\varphi_1(x)\varphi_2(x)]}{\partial x_1} = \frac{\partial\varphi_1(x)}{\partial x_1}\,\varphi_2(x) + \varphi_1(x)\,\frac{\partial\varphi_2(x)}{\partial x_1}. \tag{126}$$


We now establish the connection between generalized differentiation
and the averaging operation. Let $\omega_h(|x - y|)$ be any given averaging
kernel, depending on the distance between the points $x$ and $y$, and
$\varphi_h(x)$ the mean functions formed for $\varphi(x)$:

$$\varphi_h(x) = \frac{1}{\varkappa h^n}\int_D \omega_h(|x - y|)\,\varphi(y)\,dy. \tag{127}$$

Assuming that $\varphi(x)$ has the generalized derivative $\chi(x) = D^l\varphi(x)$
of type (122) in $D$, let us work out the corresponding (obviously ordinary)
derivative of the mean functions [71]:

$$D^l_x\varphi_h(x) = \frac{1}{\varkappa h^n}\int_D \varphi(y)\,D^l_x\omega_h(|x - y|)\,dy = \frac{(-1)^l}{\varkappa h^n}\int_D \varphi(y)\,D^l_y\omega_h(|x - y|)\,dy. \tag{128}$$

We shall assume that the point $x \in D$ is at a distance greater than $h$
from the boundary of $D$. Since the function $\omega_h(|x - y|)$ vanishes
outside a sphere of radius $h$ with centre at the point $x$, it can be taken
as the finite function in (124). Together with (128), this leads to

$$D^l_x\varphi_h(x) = \frac{1}{\varkappa h^n}\int_D \omega_h(|x - y|)\,D^l_y\varphi(y)\,dy, \tag{129}$$

which can be stated as: the mean functions of the generalized derivatives
coincide with the derivatives of the same type of the mean
functions at all points of the domain $D$ whose distance from the
boundary is greater than the averaging radius.

We can now say, on the basis of the properties of the mean functions
[71] that, as $h \to 0$, $\varphi_h(x) \to \varphi(x)$ and $D^l\varphi_h(x) \to D^l\varphi(x)$ in $L(D')$,
where $D'$ is any strictly interior subdomain of $D$. Furthermore, if we
make the supplementary assumption that $\varphi(x)$ is summable over any
strictly interior subdomain $D'$ to any given degree $p > 1$, and the
generalized derivative $D^l\varphi(x)$ to any given degree $q > 1$, we have
convergence of $\varphi_h(x)$ and $D^l\varphi_h(x)$ in $L_p(D')$ and $L_q(D')$ respectively.
A word of warning. Suppose the definition of $\varphi(x)$ is somehow extended
to the whole of $R_n$, e.g. it is put equal to zero outside $D$. The $\varphi_h(x)$ are
now also defined in the whole of space and converge to $\varphi(x)$ in $L_p(D)$
as $h \to 0$. But the functions $D^l\varphi_h(x)$ will not in general be convergent
to $D^l\varphi(x)$ in space $L_q(D)$. This is bound up with the fact that the
extended function $\varphi(x)$ may not have the corresponding generalized
derivative throughout $R_n$.

Let us now turn to the proof of (126) for the differentiation of a
product. We must first prove a simple proposition. Let $\varphi_1(x) \in L_p(D')$
($p > 1$) and $\varphi_2(x) \in L_{p'}(D')$ ($1/p + 1/p' = 1$) in any strictly interior
domain $D'$ of $D$ and let $\psi(x)$ be a bounded function finite in $D$. Then

$$\int_D \varphi_{1h}(x)\,\varphi_{2h}(x)\,\psi(x)\,dx \to \int_D \varphi_1(x)\,\varphi_2(x)\,\psi(x)\,dx \quad\text{as}\quad h \to 0.$$

In fact, we find by using Hölder's inequality:

$$\left|\int_D [\varphi_{1h}(x)\varphi_{2h}(x) - \varphi_1(x)\varphi_2(x)]\,\psi(x)\,dx\right| \le \int_D |\varphi_{1h}(x)|\,|\varphi_{2h}(x) - \varphi_2(x)|\,|\psi(x)|\,dx + \int_D |\varphi_2(x)|\,|\varphi_{1h}(x) - \varphi_1(x)|\,|\psi(x)|\,dx$$
$$\le C\left[\|\varphi_{1h}\|_{L_p(D')}\,\|\varphi_{2h} - \varphi_2\|_{L_{p'}(D')} + \|\varphi_2\|_{L_{p'}(D')}\,\|\varphi_{1h} - \varphi_1\|_{L_p(D')}\right].$$

Here, $C = \sup|\psi(x)|$ and $D'$ is the subdomain of $D$ outside which
$\psi(x)$ vanishes. The right-hand side of the last inequality tends to zero
as $h \to 0$, since $\varphi_{1h}(x) \to \varphi_1(x)$ in $L_p(D')$, $\varphi_{2h}(x) \to \varphi_2(x)$ in $L_{p'}(D')$ and
the sequence $\varphi_{1h}(x)$, convergent in $L_p(D')$, is bounded in norm in $L_p(D')$.

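We shall now establish (126) on the assumption that $\varphi_1(x)$ and
$\partial\varphi_1(x)/\partial x_1 \in L_p(D')$, whilst $\varphi_2(x)$ and $\partial\varphi_2(x)/\partial x_1 \in L_{p'}(D')$ for any
strictly interior subdomain $D'$. Making use of the previous proposition,
we have for any continuously differentiable finite function $\psi(x)$,
vanishing outside $D'$ [62]:

$$\int_D \varphi_1(x)\,\varphi_2(x)\,\frac{\partial\psi(x)}{\partial x_1}\,dx = \lim_{h \to 0}\int_D \varphi_{1h}(x)\,\varphi_{2h}(x)\,\frac{\partial\psi(x)}{\partial x_1}\,dx,$$

so that

$$\int_D \varphi_1(x)\,\varphi_2(x)\,\frac{\partial\psi(x)}{\partial x_1}\,dx = -\lim_{h \to 0}\int_D \left[\frac{\partial\varphi_{1h}(x)}{\partial x_1}\,\varphi_{2h}(x) + \varphi_{1h}(x)\,\frac{\partial\varphi_{2h}(x)}{\partial x_1}\right]\psi(x)\,dx. \tag{130}$$

Given sufficiently small $h$, we can apply (129) in $D'$ and replace
$\partial\varphi_{1h}(x)/\partial x_1$ in (130) by $(\partial\varphi_1(x)/\partial x_1)_h$, and $\partial\varphi_{2h}(x)/\partial x_1$ by $(\partial\varphi_2(x)/\partial x_1)_h$.
On again applying our auxiliary proposition to the right-hand side of
(130), we get

$$\int_D \varphi_1(x)\,\varphi_2(x)\,\frac{\partial\psi(x)}{\partial x_1}\,dx = -\int_D \left[\frac{\partial\varphi_1(x)}{\partial x_1}\,\varphi_2(x) + \varphi_1(x)\,\frac{\partial\varphi_2(x)}{\partial x_1}\right]\psi(x)\,dx.$$

This last equation implies that the product $\varphi_1(x)\varphi_2(x)$ has a generalized
derivative with respect to $x_1$ in $D$, which can be evaluated in
accordance with (126).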




Notice that (126) also holds for $p = 1$. In this case we have to take
$p' = \infty$, i.e. assume that $\varphi_2(x)$ and $\partial\varphi_2(x)/\partial x_1$ are bounded in any
subdomain $D'$.

We now show that a second definition of generalized derivative
can be given, and its equivalence to the original definition established
on the basis of (129).

DEFINITION 2. The function $\chi(x)$ is called the generalized derivative
of type (122) of a function $\varphi(x)$ in $D$ if there exists a sequence of functions
$\varphi_m(x)$, $l$ times continuously differentiable inside $D$, such that $\varphi_m(x)$ and
$D^l\varphi_m(x)$ are convergent to $\varphi(x)$ and $\chi(x)$ respectively in $L(D')$, where $D'$
is any strictly interior subdomain of $D$.

THEOREM 1. Definitions 1 and 2 are equivalent.

Let $\chi(x)$ be the generalized derivative of $\varphi(x)$ in accordance with the
second definition. Equation (123) holds, when $\varphi(x)$ is replaced by $\varphi_m(x)$, and, since
$\varphi_m(x) \to \varphi(x)$ and $D^l\varphi_m(x) \to \chi(x)$ in $L(D')$, given any choice of finite
function $\psi(x)$ with the above-mentioned properties, we can pass to the
limit under the integral sign [cf. 62], whence (124) follows.

Now let $\chi(x)$ be the generalized derivative of $\varphi(x)$ in the sense of
the first definition. By (129) and Theorem 4 of [71], the sequence of
$\varphi_m(x)$ required by the second definition is given by the mean functions
$\varphi_{h_m}(x)$, given any sequence $h_m$ tending to zero (we are assuming that
$\varphi(x)$ is continued by zero outside $D$). Theorem 1 is proved. It follows
from this theorem that the generalized derivative is unique if it exists,
in the sense of the second definition.

We now prove a theorem to the effect that generalized derivatives
are capable of weak convergence in $L_p(D')$.

THEOREM 2. Let $\varphi_k(x)$ ($k = 1, 2, \ldots$), defined inside $D$, be weakly
convergent to a function $\varphi(x)$ in $L_p(D')$ ($p > 1$), where $D'$ is any domain
lying strictly inside $D$, have generalized derivatives $D^l\varphi_k(x)$ of form (122)
in $D$ and norms of $D^l\varphi_k(x)$ in $L_p(D')$ bounded by some number $M(D')$,
which depends on the choice of $D'$. Then $\varphi(x)$ has a generalized derivative
$D^l\varphi(x)$ of form (122) in $D$, equal to the weak limit of $D^l\varphi_k(x)$ in $L_p(D')$.

PROOF. In view of the weak compactness of bounded sets in $L_p$ for
$p > 1$, the inequality

$$\|D^l\varphi_k\|_{L_p(D')} \le M(D') \tag{131}$$

implies the existence of a subsequence $\varphi_{m_k}(x)$ such that the $D^l\varphi_{m_k}(x)$ are
weakly convergent in $L_p(D')$. By taking a sequence of strictly interior
expanding domains $D_m$ convergent to $D$, we can form with the aid of
a diagonal process a subsequence $D^l\varphi_{m_k}(x)$ for which the derivatives
$D^l\varphi_{m_k}(x)$ are weakly convergent in $L_p(D')$ to some function $\chi(x)$ in any
strictly interior subdomain $D'$. It is clear that $\chi(x)$ is defined everywhere
in $D$ and belongs to $L_p(D')$ for a strictly interior domain $D'$.

Equation (123) holds when $\varphi(x)$ is replaced by $\varphi_{m_k}(x)$. On passing to
the limit in it with $\psi(x)$ fixed, and observing that $\psi(x)$ is finite, we
arrive at (124) (weak convergence), whence it follows that $\chi(x)$ is the
generalized derivative of $\varphi(x)$ in $D$. It follows from what has been said
that any weakly convergent subsequence $D^l\varphi_{m_k}(x)$ has the same limit
$\chi(x)$ (the generalized derivative is unique), and we can easily conclude
from this that the entire sequence $D^l\varphi_k(x)$ is weakly convergent to $\chi(x)$.

Notes 1. This last theorem shows that, if $\varphi(x) \in L_p(D')$ and (131)
holds for the derivatives of the mean functions $\varphi_h(x)$, there exists in $D$
the generalized derivative $D^l\varphi(x) \in L_p(D')$. We have already seen
that in this case $D^l\varphi_h(x) \to D^l\varphi(x)$ in $L_p(D')$, so that the norm of
$D^l\varphi(x)$ satisfies (131).

2. In the conditions of the theorem, functions $\varphi_k(x)$ and $D^l\varphi_k(x)$ may
belong to $L_p(D')$ and $L_q(D')$ respectively, with $p \ne q$.

3. Theorem 2 remains in force with $p = 1$, if, instead of (131), we
assume the weak compactness of functions $D^l\varphi_k(x)$ in $L(D')$ for any
strictly interior subdomain $D'$ of $D$.


110. Generalized derivatives (continued). Let us now establish the
connection between the existence of the generalized derivatives
and the absolute continuity of functions. We take the case of one
independent variable and $0 < x < 1$ as the fundamental domain $D$.
Let $\varphi(x)$ be absolutely continuous in $[0, 1]$. As we know from [74],
$\varphi(x)$ has a derivative $\varphi'(x)$ in $[0, 1]$, which is summable in $[0, 1]$. The
formula for integration by parts [74] gives, for any continuously
differentiable finite function $\psi(x)$,

$$\int_0^1 \varphi(x)\,\psi'(x)\,dx = -\int_0^1 \varphi'(x)\,\psi(x)\,dx, \tag{132}$$

which shows that $\varphi'(x)$ is the generalized derivative of $\varphi(x)$.
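Formula (132) can be checked numerically for a function with a corner. The following sketch is an illustration only: the absolutely continuous function $|x - 0.3|$, the test function supported on $[0.1, 0.9]$, the midpoint rule and all names are assumptions made for the example.

```python
def phi(x):
    """An absolutely continuous function with a corner at x = 0.3."""
    return abs(x - 0.3)


def dphi(x):
    """Its derivative, defined everywhere except at the corner."""
    return 1.0 if x > 0.3 else -1.0


def psi(x):
    """A continuously differentiable function vanishing outside [0.1, 0.9]."""
    if 0.1 < x < 0.9:
        return ((x - 0.1) * (0.9 - x)) ** 2
    return 0.0


def dpsi(x):
    """The ordinary derivative of psi."""
    if 0.1 < x < 0.9:
        u = (x - 0.1) * (0.9 - x)
        return 2.0 * u * (1.0 - 2.0 * x)
    return 0.0


def midpoint(f, n=20000):
    """Midpoint rule on [0, 1]; the corner x = 0.3 falls on a cell boundary."""
    h = 1.0 / n
    return sum(f((j + 0.5) * h) for j in range(n)) * h


lhs = midpoint(lambda x: phi(x) * dpsi(x))    # integral of phi * psi'
rhs = -midpoint(lambda x: dphi(x) * psi(x))   # minus integral of phi' * psi
```

The two numbers agree up to the quadrature error, which is what (132) asserts for this pair of functions.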

Now let $\varphi(x) \in L([0, 1])$ and have a generalized derivative $d\varphi(x)/dx$
in $D$, belonging to $L([0, 1])$. Let us show that $\varphi(x)$ is now equivalent
to some function absolutely continuous in $[0, 1]$.

We write

$$\varphi_1(x) = \int_0^x \frac{d\varphi(t)}{dt}\,dt$$




and observe that $\varphi_1(x)$ is absolutely continuous and that its derivative
$\varphi_1'(x)$ is equivalent to $d\varphi(x)/dx$ [74]. The difference $\varphi^*(x) = \varphi(x) - \varphi_1(x)$
obviously has a generalized derivative equivalent to zero. We fix $\varepsilon > 0$ and
consider the interval $[\varepsilon, 1 - \varepsilon]$. Given sufficiently small $h$, the derivative
of the mean function $\varphi^*_h(x)$ is equal to zero in $[\varepsilon, 1 - \varepsilon]$, so that
$\varphi^*_h(x)$ is constant in $[\varepsilon, 1 - \varepsilon]$. Since a limit of constants must be
constant, and $\varphi^*_h(x) \to \varphi^*(x)$ in $L([\varepsilon, 1 - \varepsilon])$, $\varphi^*(x)$ is equivalent to a
constant in $[\varepsilon, 1 - \varepsilon]$. Hence it follows that

$$\varphi(x) = \varphi(0) + \int_0^x \frac{d\varphi(t)}{dt}\,dt \tag{133}$$

everywhere in $D$, discounting equivalence. We have thus established
that the existence of a generalized derivative is equivalent to the
absolute continuity of $\varphi(x)$. It may be shown similarly, for the case
of several independent variables, that, if $\varphi(x_1, x_2, \ldots, x_n)$ has a
generalized derivative $\partial\varphi(x)/\partial x_1$ say in the cube $[0 < x_k < 1;\ k = 1, 2, \ldots, n]$
and $\varphi(x)$ and $\partial\varphi(x)/\partial x_1 \in L_p$ ($p > 1$) in this cube, then
$\varphi(x)$ is absolutely continuous for $0 \le x_1 \le 1$ for almost all values of
$(x_2, x_3, \ldots, x_n)$ in the cube $[0 < x_k < 1;\ k = 2, 3, \ldots, n]$ and we have

$$\varphi(x_1, x_2, \ldots, x_n) = \varphi(0, x_2, \ldots, x_n) + \int_0^{x_1} \frac{\partial\varphi(t, x_2, \ldots, x_n)}{\partial t}\,dt \tag{134}$$

for all $x_1$ of $[0, 1]$.

This equation, like (133), needs some explanation. The function $\varphi(x)$
and its generalized derivative $D\varphi(x)$ are defined up to a set
of measure zero; thus (133) and (134) have to be understood in the
sense that there are functions of the class of functions equivalent to
$\varphi(x)$ for which these equations hold.

We shall now give an example of a function $\varphi(x_1, x_2)$, having a
generalized mixed derivative $\partial^2\varphi(x_1, x_2)/\partial x_1\partial x_2$, but not having
generalized first derivatives. The function $\varphi(x_1, x_2) = f(x_1) + f(x_2)$
($0 \le x_k \le 1$; $k = 1, 2$) has this property, where $f(x)$ is the continuous
function of [76]. The function $\varphi(x_1, x_2)$ has no generalized first derivatives,
since $f(x)$ is not absolutely continuous. Whereas the generalized
derivative $\partial^2\varphi(x_1, x_2)/\partial x_1\partial x_2$ exists and is exactly equal to zero. For,
given any smooth finite function $\psi(x_1, x_2)$, we have

$$\int_0^1\!\!\int_0^1 f(x_1)\,\frac{\partial^2\psi(x_1, x_2)}{\partial x_1\,\partial x_2}\,dx_1\,dx_2 = \int_0^1 f(x_1)\left[\frac{\partial\psi(x_1, x_2)}{\partial x_1}\right]_{x_2=0}^{x_2=1} dx_1 = 0,$$

and similarly for $f(x_2)$, i.e.

$$\int_0^1\!\!\int_0^1 [f(x_1) + f(x_2)]\,\frac{\partial^2\psi(x_1, x_2)}{\partial x_1\,\partial x_2}\,dx_1\,dx_2 = 0,$$

whence it follows (definition 1) that the generalized derivative

$$\frac{\partial^2\varphi(x_1, x_2)}{\partial x_1\,\partial x_2} = 0.$$
It is worth noticing that, if $\varphi(x_1, x_2, \ldots, x_n)$ is continuous in $\bar{D}$
and if $D$ can be divided with the aid of a finite number of smooth
surfaces into a finite number of domains $D_i$ ($i = 1, 2, \ldots, m$), in each
of which $\varphi(x)$ is continuously differentiable with respect to an $x_k$ as
far as the boundary, then $\varphi(x)$ has a generalized derivative in $D$
equal to $\partial\varphi(x)/\partial x_k$ in each of the $D_i$. This derivative can have
discontinuities of the first kind on the above-mentioned surfaces. Our
assertion follows at once from the formula for integration by parts:

$$\int_{D_i} \varphi(x)\,\frac{\partial\psi(x)}{\partial x_k}\,dx = -\int_{D_i} \frac{\partial\varphi(x)}{\partial x_k}\,\psi(x)\,dx + \int_{S_i} \varphi(x)\,\psi(x)\cos(n, x_k)\,dS, \tag{135}$$

where $S_i$ is the boundary of $D_i$ and $n$ the direction of the normal to
$S_i$ outward with respect to $D_i$. We only need to observe that the
integrals over the surfaces $S_i$ cancel on summation over $i$.

If $\varphi(x)$ has different limiting values on an $(n - 1)$-dimensional piece
$S$ of surface, lying in $D$, and the direction $x_k$ does not lie in the tangent
plane to this surface, no generalized derivative $\partial\varphi(x)/\partial x_k$ exists in $D$.
This follows from the connection established above between the
absolute continuity of $\varphi(x)$ and the existence of the generalized
derivative.
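The simplest case of this remark can be worked out directly in one variable; the example is added here as a sketch and is not taken from the original text. Let $D = (-1, 1)$ and let $\varphi(x) = 0$ for $x < 0$, $\varphi(x) = 1$ for $x > 0$. For any continuously differentiable $\psi(x)$ finite in $D$,

```latex
\int_{-1}^{1} \varphi(x)\,\psi'(x)\,dx = \int_{0}^{1} \psi'(x)\,dx = \psi(1) - \psi(0) = -\psi(0).
```

If a summable $\chi(x)$ satisfied (124) with $l = 1$, we should have $\int_{-1}^{1} \chi(x)\psi(x)\,dx = \psi(0)$ for all such $\psi$. Taking $\psi$ vanishing near $x = 0$ shows that $\chi$ is equivalent to zero in $(-1, 0)$ and in $(0, 1)$, hence in $D$, and then $\psi(0) = 0$ for every $\psi$, which is false. Thus $\varphi$ has no generalized derivative in $D$, in agreement with the fact that it is not equivalent to an absolutely continuous function.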

Note. Instead of the concept of a separate generalized derivative
of a summable function $\varphi(x)$, we can introduce the concept of a generalized
linear differential operator of any order, say

$$L(\varphi) = \sum_{i,k=1}^{n} a_{ik}\,\frac{\partial^2\varphi}{\partial x_i\,\partial x_k} + \sum_{k=1}^{n} b_k\,\frac{\partial\varphi}{\partial x_k} + c\,\varphi(x), \tag{136}$$

where the coefficients are sufficiently smooth functions of $(x_1, x_2, \ldots, x_n)$.

Such a generalized operator is defined by an equation analogous
to (124):

$$\int_D \varphi(x)\,M(\psi)\,dx = \int_D L(\varphi)\,\psi(x)\,dx, \tag{137}$$
D D 




where $M(\psi)$ is the conjugate differential operator and $\psi(x)$ is any
smooth function finite in $D$ [IV; 158]. The existence of the individual
derivatives appearing in the operator $L(\varphi)$ is not assumed here.


111. The case of a star-shaped domain. We have shown [109]
with the aid of mean functions that, given any $\varphi(x)$ of $L_p(D')$ ($p > 1$)
having a generalized derivative $\chi(x) = D^l\varphi(x)$ also of $L_p(D')$, there
exists a sequence of $\varphi_k(x)$, $l$ times continuously differentiable in $D$,
such that $\varphi_k(x) \to \varphi(x)$ and $D^l\varphi_k(x) \to \chi(x)$ in $L_p(D')$. (Here, as above,
$D'$ is any strictly interior subdomain of $D$.) We now show that an
analogous approximation of functions $\varphi(x)$ and $D^l\varphi(x)$ is also possible
in space $L_p(D)$ for an important class of domains.

We shall describe $D$ as a star-shaped domain if there exists an interior
point $x_0$ such that every radius vector from $x_0$ cuts the boundary in
only one point. It may also be said that the domain is star-shaped
with respect to the point $x_0$.

THEOREM. Let $D$ be a star-shaped domain and $\varphi(x)$ have a generalized
derivative $D^l\varphi(x)$ in $D$, where $\varphi(x)$ and $D^l\varphi(x)$ belong to $L_p(D)$ ($p > 1$).
Then there exists a sequence $\varphi_k(x)$ of functions $l$ times continuously
differentiable in $\bar{D}$ such that $\varphi_k(x)$ and $D^l\varphi_k(x)$ are convergent to $\varphi(x)$ and
$D^l\varphi(x)$ in $L_p(D)$.

We speak of functions $\varphi_k(x)$, $l$ times continuously differentiable in $\bar{D}$,
if the $\varphi_k(x)$ are continuous in $\bar{D}$, are continuously differentiable inside $D$
up to order $l$, and their derivatives can be given a supplementary
definition on the boundary of $D$ in such a way as to obtain functions
continuous in $\bar{D}$.

We take the above $x_0$ as origin and form the sequence of functions
$\varphi\left(\frac{k-1}{k}x\right)$ ($k = 2, 3, 4, \ldots$), defined inside the domains $D_k$,
containing $D$ strictly inside themselves, $D_k$ being got from $D$ by a similitude
transformation with similitude coefficient $k/(k - 1)$.

We write $\varphi\left(\frac{k-1}{k}x\right) = \varphi^{(k)}(x)$ and show that the $\varphi^{(k)}(x)$ are
convergent in $L_p(D)$ to $\varphi(x)$, whilst the generalized derivatives $D^l\varphi^{(k)}(x)$ are
convergent in $L_p(D)$ to $\chi(x) = D^l\varphi(x)$. Let us prove say the second
assertion. We have

$$\|\chi(x) - D^l\varphi^{(k)}(x)\|_{L_p(D)} = \left[\int_D \left|\chi(x) - \left(\frac{k-1}{k}\right)^l \chi\left(\frac{k-1}{k}x\right)\right|^p dx\right]^{1/p} \le$$
$$\le \left[1 - \left(\frac{k-1}{k}\right)^l\right]\left\|\chi\left(\frac{k-1}{k}x\right)\right\|_{L_p(D)} + \left\|\chi(x) - \chi\left(\frac{k-1}{k}x\right)\right\|_{L_p(D)}.$$

The distance between the points $\frac{k-1}{k}x$ and $x$ does not exceed
$d/k$, where $d$ is the diameter of $D$, and consequently tends to zero
uniformly in $D$. On repeating the argument of the theorem on continuity
in the mean [70], the second term on the right-hand side of
the last inequality will be seen to tend to zero as $k \to \infty$.

The factor $1 - \left(\frac{k-1}{k}\right)^l$ in the first term tends to zero, whilst
the second factor is convergent by what has been said to $\|\chi(x)\|_{L_p(D)}$
(continuity of the norm). The proof of the convergence of $\varphi^{(k)}(x)$ to
$\varphi(x)$ in $L_p(D)$ is even simpler. We observe that, with $k$ fixed, $D$ is
strictly interior with respect to $D_k$, so that the mean functions $\varphi^{(k)}_h(x)$
are convergent to $\varphi^{(k)}(x)$, whilst their derivatives $D^l\varphi^{(k)}_h(x)$ are convergent
to $D^l\varphi^{(k)}(x)$ in $L_p(D)$ as $h \to 0$. Hence it follows that $\varphi(x)$ and
$\chi(x)$ can be approximated in the metric of $L_p(D)$ by functions $\varphi^{(k)}_{h_k}(x)$
and $D^l\varphi^{(k)}_{h_k}(x)$ infinitely differentiable in $\bar{D}$ with a suitable choice of
the sequence $h_k \to 0$. The functions $\varphi^{(k)}_{h_k}(x)$ can be taken as the $\varphi_k(x)$
of the statement of the theorem. The theorem is proved.


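112. Spaces $\widehat{W}_p^{(l)}$ and $W_p^{(l)}$. As above, let $D$ be a bounded domain
of $n$-dimensional space. We consider the set of all functions $\varphi(x)$
having all generalized derivatives of order $l$, where $\varphi(x)$ and all the
$D^l\varphi(x)$ belong to $L_p(D)$ ($p > 1$). This class of functions will be denoted
by $\widehat{W}_p^{(l)}(D)$; it can be converted into a normed space by introducing
the norm in accordance with the formula

$$\|\varphi\|^p_{\widehat{W}_p^{(l)}(D)} = \int_D |\varphi(x)|^p\,dx + \int_D \sum_{l_1+l_2+\ldots+l_n=l}\left|\frac{\partial^l\varphi(x)}{\partial x_1^{l_1}\ldots\partial x_n^{l_n}}\right|^p dx. \tag{138}$$

Here and below, $\sum_{l_1+l_2+\ldots+l_n=l}$ denotes summation over all possible
sets of non-negative integers $(l_1, l_2, \ldots, l_n)$, the sums of which yield $l$. The
fundamental properties of the norm, as indicated in [95], may easily
be verified. Let us show that space $\widehat{W}_p^{(l)}(D)$ is complete. Let $\varphi_k(x)$ be a
mutually convergent sequence in $\widehat{W}_p^{(l)}(D)$, i.e.

$$\int_D \left[|\varphi_k(x) - \varphi_m(x)|^p + \sum_{l_1+l_2+\ldots+l_n=l}\left|\frac{\partial^l\varphi_k(x)}{\partial x_1^{l_1}\ldots\partial x_n^{l_n}} - \frac{\partial^l\varphi_m(x)}{\partial x_1^{l_1}\ldots\partial x_n^{l_n}}\right|^p\right] dx \to 0$$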


as $k$ and $m \to \infty$. Hence it follows that the sequences $\varphi_k(x)$ and
$D^l\varphi_k(x)$ are mutually convergent in $L_p(D)$. In view of the completeness
of $L_p(D)$ and Theorem 2 of [109], we find that the $\varphi_k(x)$ are convergent in
$L_p(D)$ to some function $\varphi(x)$, this latter having all possible generalized
derivatives of order $l$ from $L_p(D)$ and $D^l\varphi_k(x) \to D^l\varphi(x)$ in $L_p(D)$.
What has been said is equivalent to the convergence of $\varphi_k(x)$ to $\varphi(x)$
in $\widehat{W}_p^{(l)}(D)$. Thus $\widehat{W}_p^{(l)}(D)$ is a $B$ space (a complete linear normed space).

Let us dwell on the proof that $\widehat{W}_p^{(l)}(D)$ is separable. To this end, we
represent $D$ as a denumerable set of non-overlapping semi-open intervals
[32]. We enumerate the corresponding open intervals, written
as $D_k$ ($k = 1, 2, \ldots$) and introduce the set $V_p^{(l)}(D)$ of functions $\varphi(x)$,
belonging to $\widehat{W}_p^{(l)}(D_k)$ in each interval $D_k$ and such that the series is
convergent:

$$\|\varphi\|^p_{V_p^{(l)}(D)} = \sum_{k=1}^{\infty} \|\varphi\|^p_{\widehat{W}_p^{(l)}(D_k)}. \tag{139}$$

Specification of the norm in accordance with (139) converts $V_p^{(l)}(D)$
into a linear normed space. It is easily seen that functions of $\widehat{W}_p^{(l)}(D)$
belong to $V_p^{(l)}(D)$ and $\|\varphi\|_{\widehat{W}_p^{(l)}(D)} = \|\varphi\|_{V_p^{(l)}(D)}$. Thus $\widehat{W}_p^{(l)}(D)$ is a
subspace of $V_p^{(l)}(D)$, and it is sufficient for us to show that the latter
is separable [94].
The set of functions of $V_p^{(l)}(D)$, differing from zero only in a finite
number of intervals, is dense in $V_p^{(l)}(D)$. For, let $\varphi(x) \in V_p^{(l)}(D)$ and
$\varepsilon > 0$ be an arbitrary number. We put $\varphi_m(x) = \varphi(x)$ for $x \in \bigcup_{k=1}^{m} D_k$
and $\varphi_m(x) = 0$ in the remaining part of $D$. Obviously, $\varphi_m(x) \in V_p^{(l)}(D)$,
and for sufficiently large $m$:

$$\|\varphi - \varphi_m\|^p_{V_p^{(l)}(D)} = \sum_{k=m+1}^{\infty} \|\varphi\|^p_{\widehat{W}_p^{(l)}(D_k)} < \varepsilon$$

in view of the convergence of series (139). The functions $\varphi_m(x)$ can be
approximated in the metric of $\widehat{W}_p^{(l)}(D_k)$ in each of the intervals $D_k$
($k \le m$) by $l$ times continuously differentiable functions [111], and
in turn the latter can be uniformly approximated together with their
derivatives in $D_k$ by polynomials with rational coefficients. Hence it
follows that the set of functions, each of which differs from zero only
in a finite number of intervals $D_k$ and coincides with a polynomial
with rational coefficients in each of these, is dense in $V_p^{(l)}(D)$. It is
easily seen that the set of such functions is denumerable, i.e. $V_p^{(l)}(D)$
is separable. This proves that space $\widehat{W}_p^{(l)}(D)$ is separable.

We must now dwell on a special problem. Suppose that, in addition
to the basic norm $\|x\|$, another norm $\|x\|_1$ is introduced into a
linear normed space $X$, where for all $x \in X$:

$$c_1\|x\| \le \|x\|_1 \le c_2\|x\| \tag{140}$$

and $c_1 > 0$, $c_2 > 0$ are constants. Norms satisfying condition (140)
are said to be equivalent. Obviously, a sequence $x_n$, convergent in one
norm, is convergent in the other. It is unimportant which equivalent
norm is considered when deciding about denseness, separability,
compactness etc. Similarly, a distributive operator, bounded in one
norm, is also bounded with respect to another (equivalent) norm.
Generally speaking, the norm of a bounded operator changes when
passing to an equivalent norm in the space, but remains finite.

We must now consider in more detail the question of equivalent
norms in $n$-dimensional real Euclidean space $R_n$. We have introduced
a norm into $R_n$ in accordance with

$$\|x\| = \sqrt{x_1^2 + x_2^2 + \ldots + x_n^2}. \tag{141}$$

Now let the function $f(x) = f(x_1, x_2, \ldots, x_n)$ have the properties
of the norm [95] and, in addition, be continuous on the surface of the
unit sphere $x_1^2 + x_2^2 + \ldots + x_n^2 = 1$. We show that $\|x\|_1 = f(x)$
represents a norm equivalent to (141). The continuous $f(x)$ attains its
maximum and minimum on the surface of the unit sphere. Let us
write $c_2 = \sup f(x)$ and $c_1 = \inf f(x)$ for $\|x\| = 1$. Since $f(x)$ is positive
and continuous, we have $0 < c_1 \le c_2 < +\infty$. In view of the properties
of $f(x)$ everywhere in $R_n$,

$$c_1\|x\| \le f(x) \le c_2\|x\|,$$

i.e. norms (141) and $f(x)$ are equivalent.
We can take as $f(x)$ say

$$f(x) = \left[\sum_{k=1}^{n} |x_k|^p\right]^{1/p} \ (p > 1) \quad\text{or}\quad f(x) = \max_k |x_k|.$$
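For the second of these choices the constants of (140) can be written down and checked on random vectors. The sketch below is an illustration only: the dimension $n = 5$, the random sample and the function names are assumptions made here; the values $c_1 = 1/\sqrt{n}$ and $c_2 = 1$ are the infimum and supremum of $\max_k |x_k|$ on the unit sphere.

```python
import math
import random


def euclid(x):
    """The Euclidean norm (141)."""
    return math.sqrt(sum(t * t for t in x))


def max_norm(x):
    """The norm f(x) = max |x_k|."""
    return max(abs(t) for t in x)


n = 5
c1 = 1.0 / math.sqrt(n)  # attained when all |x_k| are equal
c2 = 1.0                 # attained on a coordinate axis

random.seed(0)
samples = [[random.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(1000)]

tol = 1e-12  # allowance for rounding error
ok = all(
    c1 * euclid(x) - tol <= max_norm(x) <= c2 * euclid(x) + tol
    for x in samples
)
```

The flag `ok` records whether $c_1\|x\| \le f(x) \le c_2\|x\|$ held for every sampled vector.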


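It can easily be shown, on the basis of these remarks, that the norms
given in $\widehat{W}_p^{(l)}(D)$ by

$$\|\varphi\| = \sum_{l_1+\ldots+l_n=l} \|D^l\varphi\|_{L_p(D)} + \|\varphi\|_{L_p(D)} \tag{142}$$
$$\|\varphi\| = \max_{l_1+\ldots+l_n=l} \|D^l\varphi\|_{L_p(D)} + \|\varphi\|_{L_p(D)} \tag{143}$$
$$\|\varphi\| = \left\{\int_D \left[\sum_{l_1+\ldots+l_n=l} |D^l\varphi|^2\right]^{p/2} dx\right\}^{1/p} + \|\varphi\|_{L_p(D)} \tag{144}$$

are equivalent to the basic norm (138). We shall make use of all this
later.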




We now introduce a further function space. Given a set of functions 
having all possible generalized derivatives in D up to and including 
order l, and belonging to Z,(D) together with their derivatives, we 
define the norm by 


I 
lelWPm=S SS Dell. (145) 


k=0k,+...+k,=k 


Let WD) denote the linear normed space obtained. It can be 
shown, in the same way as for W(D), that our new space is complete 
and separable. The above remark about equivalent norms in WD) 
applies in equal measures to WD). We shall show below [116] that, 
for a fairly wide class of domains, spaces WD) and WO(D ) consist 
of the same set of functions, and norms (138) and (145) are equivalent.
When $l = 0$ and $l = 1$, the two spaces coincide by definition, $W_p^{(0)}(D)$
being obviously $L_p(D)$.


113. Properties of functions of space W® (D). A systematic study 
of the properties of functions of spaces WD) and WD) will be 
undertaken in the next few sections, devoted to so-called embedding 
theorems. The results of the present section are particular cases of 
these general theorems, but will be proved independently (by simpler 
means) in view of their importance. 

We must first note a property of WD), that follows directly from 
the definition. Suppose we have a change of variables x (a1, £o, ..-, Xn) 
to Y (Y Yz ---, Yn) such that D is mapped one-to-one on to a do- 
main D, the mapping being expressed on both sides by functions 
having continuous derivatives up to order / in the corresponding closed 
domains. If our change of independent variables is carried out for the 
functions, space WD) becomes wo D,). 

It follows at once from the definition of WD) that, if p(x) € 
€ WD), then g(x) € WED) for g < pand m < L We shall prove 
the following theorem in connection with this. 

THEOREM 1. If U is a set of elements g(x) € WD), bounded in 
WD) (l > 1), it is compact in W-D’), where D’ is any domain 
lying strictly inside D. We shall first prove the theorem for l = 1. 
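There exists by hypothesis a constant C such that, if $\varphi(x) \in U$, then

$$\|\varphi\|^p_{W_p^{(1)}(D)} = \int_D \Big[\,|\varphi(x)|^p + \sum_{k=1}^{n} \Big|\frac{\partial\varphi}{\partial x_k}\Big|^p\,\Big]\, dx \le C^p. \tag{146}$$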




We have to show that the set U is compact in $L_p(D')$, where D' is
a fixed domain lying strictly inside D. The fact that U is bounded in
$L_p(D')$ is an immediate consequence of (146). It remains to show the
equicontinuity in the mean of $\varphi(x) \in U$ in $L_p(D')$ [70]. Let us show
that, for sufficiently small $|\Delta x| = \sqrt{(\Delta x_1)^2 + (\Delta x_2)^2 + \dots + (\Delta x_n)^2}$,
we have

$$\int_{D'} |\varphi(x + \Delta x) - \varphi(x)|^p\, dx \le C_1 |\Delta x|^p, \tag{147}$$

where $C_1$ is a constant, the same for all $\varphi(x) \in U$, whence the
equicontinuity follows.

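We can assume, by rotating the coordinate axes if necessary, that
$\Delta x$ is $(\Delta x_1, 0, 0, \dots, 0)$ and $\Delta x_1 > 0$. Let D'' be a domain of the same
type as D', D' being strictly inside D''. We shall assume $\Delta x_1$ so small
that $x + \Delta x$ does not go outside D'' when $x \in D'$. We suppose first
that $\varphi(x)$ is continuously differentiable in D. Obviously,

$$R = \int_{D'} |\varphi(x + \Delta x) - \varphi(x)|^p\, dx = \int_{D'} \Big| \int_0^{\Delta x_1} \frac{\partial\varphi(x_1 + \tau, x_2, \dots, x_n)}{\partial x_1}\, d\tau \Big|^p dx.$$

When p > 1, we can apply Hölder's inequality to the inner integral:

$$\begin{aligned}
R &\le \int_{D'} (\Delta x_1)^{p/p'} \Big[ \int_0^{\Delta x_1} \Big|\frac{\partial\varphi(x_1 + \tau, x_2, \dots, x_n)}{\partial x_1}\Big|^p d\tau \Big] dx = \\
&= (\Delta x_1)^{p/p'} \int_0^{\Delta x_1} \Big[ \int_{D'} \Big|\frac{\partial\varphi(x_1 + \tau, x_2, \dots, x_n)}{\partial x_1}\Big|^p dx \Big] d\tau \le \\
&\le (\Delta x_1)^{p/p'} \int_0^{\Delta x_1} \|\varphi\|^p_{W_p^{(1)}(D'')}\, d\tau = (\Delta x_1)^p\, \|\varphi\|^p_{W_p^{(1)}(D'')}.
\end{aligned}$$

Hence

$$\int_{D'} |\varphi(x + \Delta x) - \varphi(x)|^p\, dx \le (\Delta x_1)^p\, \|\varphi\|^p_{W_p^{(1)}(D'')}. \tag{148}$$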
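Inequality (148) can be checked numerically in one dimension. The sketch below is an illustration under stated assumptions, not part of the proof: it takes D' = (1, 2), D'' = (1, 2 + Δx), the sample function sin x, p = 3, and a midpoint quadrature rule; the helper names `midpoint` and `shift_inequality` are hypothetical. Only the derivative term of the norm is used on the right, which makes the right-hand side smaller than in (148).

```python
import math


def midpoint(g, a, b, m=4000):
    """Midpoint-rule approximation of the integral of g over (a, b)."""
    h = (b - a) / m
    return h * sum(g(a + (i + 0.5) * h) for i in range(m))


def shift_inequality(phi, dphi, a, b, dx, p):
    """Return both sides of the one-dimensional analogue of (148).

    Left:  integral over D' = (a, b) of |phi(x + dx) - phi(x)|^p.
    Right: dx^p times the integral over D'' = (a, b + dx) of |phi'(x)|^p.
    """
    lhs = midpoint(lambda x: abs(phi(x + dx) - phi(x)) ** p, a, b)
    rhs = dx ** p * midpoint(lambda x: abs(dphi(x)) ** p, a, b + dx)
    return lhs, rhs


lhs, rhs = shift_inequality(math.sin, math.cos, 1.0, 2.0, 0.1, p=3)
print(lhs, "<=", rhs)
```

Halving Δx should reduce the left-hand side by roughly the factor $2^p$, in agreement with (147).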

When p = 1, this inequality is obtained directly by changing the
order of integration. Inequality (148) holds for any function $\varphi(x)$ of
$W_p^{(1)}(D)$, as well as for continuously differentiable functions. This is
easily seen by choosing a sequence of continuously differentiable
functions convergent to $\varphi(x)$ in $W_p^{(1)}(D)$, and passing to the limit in
(148). In conjunction with (146), (148) leads to (147). The theorem is
thus proved for l = 1. Now let l = 2. On observing that, say,
$\partial^2\varphi(x)/\partial x_i\,\partial x_k$ is the generalized derivative of $\partial\varphi(x)/\partial x_i$ with
respect to $x_k$, we can prove the theorem for l = 2 by applying it with
l = 1. The case of any l is similarly considered.

A further theorem must be mentioned, which is a direct consequence 
of the theorem proved in [IV; 156]. 

THEOREM 2. If a sequence of functions g(x), continuous and having 
continuous derivatives up to order l= [n/2]+ 1, is convergent in 
WD’), where D’ is any domain lying strictly inside D, the p(x) are 
uniformly convergent in any domain D’. 

It follows at once from what has been said that the limit function 
g(x) is continuous inside D. Now suppose that we have a function 
p(x) € WD), where 1 > [n/2]+ 1, and g(x) are mean functions 
for p(x), and the averaging radius tends to zero as k — ce. On taking 
into account the property of mean functions of the generalized deriv- 
atives [109] and Theorem 2, we arrive at the proposition: if g(x) € 
€ WY(D) and l > n/2+ 1, g(x) is equivalent to a function con- 
tinuous in D. 

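Now let the sequence of $\varphi_k(x)$ be convergent in $W_p^{(1)}$. Let us
investigate these functions on any section of the domain D. Let D be
the cylinder defined by $0 \le x_n \le a$ (a is a finite number) and
$(x_1, x_2, \dots, x_{n-1})$ belonging to the closure $\bar{E}$ of some finite
domain E of the $(x_1, x_2, \dots, x_{n-1})$ plane. We shall write $E_{x_n}$ for the
section of D by the planes $x_n = \text{const}$.

THEOREM 3. Let $\varphi_k(x)$ (k = 1, 2, ...) be continuous and have
continuous derivatives $\partial\varphi_k(x)/\partial x_n$ in D, and let both $\varphi_k(x)$ and
$\partial\varphi_k(x)/\partial x_n$ be convergent in $L_p(D)$ (p > 1). Then $\varphi_k(x)$ are uniformly
convergent in $L_p(E_{x_n})$ with respect to $x_n$ of [0, a], the limit function
$\varphi(x)$ of $L_p(D)$ is defined on all sections $E_{x_n}$ and, as an element of
$L_p(E_{x_n})$, depends continuously on $x_n$.

We shall first prove the theorem for $x_n \in [a/2, a]$. We take a
function $\xi(x_n)$, continuously differentiable in [0, a], equal to zero
for $x_n = 0$ and unity for $x_n \in [a/2, a]$. It may be verified directly
that the functions $\psi_k(x) = \xi(x_n)\varphi_k(x)$ and $\partial\psi_k(x)/\partial x_n$ are
convergent in $L_p(D)$.

On using the formula

$$\psi_k(x) = \int_0^{x_n} \frac{\partial\psi_k(x_1, x_2, \dots, x_{n-1}, \tau)}{\partial\tau}\, d\tau \tag{149}$$

and Hölder's inequality, we obtain

$$\begin{aligned}
\int_{E_{x_n}} |\psi_j(x) - \psi_k(x)|^p\, dx_1 \dots dx_{n-1}
&= \int_{E_{x_n}} \Big| \int_0^{x_n} \Big[ \frac{\partial\psi_j(x_1, \dots, x_{n-1}, \tau)}{\partial\tau} - \frac{\partial\psi_k(x_1, \dots, x_{n-1}, \tau)}{\partial\tau} \Big] d\tau \Big|^p dx_1 \dots dx_{n-1} \le \\
&\le x_n^{p/p'} \int_{E_{x_n}} \int_0^{x_n} \Big| \frac{\partial\psi_j(x_1, \dots, x_{n-1}, \tau)}{\partial\tau} - \frac{\partial\psi_k(x_1, \dots, x_{n-1}, \tau)}{\partial\tau} \Big|^p d\tau\, dx_1 \dots dx_{n-1} \le \\
&\le a^{p/p'}\, \Big\| \frac{\partial\psi_j}{\partial x_n} - \frac{\partial\psi_k}{\partial x_n} \Big\|^p_{L_p(D)}.
\end{aligned}$$

Hence it follows that $\psi_k(x)$ are convergent uniformly with respect
to $x_n$, in the norm of $L_p(E_{x_n})$, to some function $\psi(x)$. Since $\xi(x_n) = 1$
for $x_n \in [a/2, a]$, we thus obtain for the $\varphi_k(x)$ a limit function $\varphi(x)$
in $L_p(E_{x_n})$, defined on each section $E_{x_n}$ if $x_n \in [a/2, a]$. This assertion
can be proved in precisely the same way for $x_n \in [0, a/2]$, and it is
obvious that the limit function $\varphi(x)$, defined on all the sections $E_{x_n}$,
will be the limit of the $\varphi_k(x)$ in $L_p(D)$ also. It remains to prove that
$\varphi(x)$, as an element of $L_p(E_{x_n})$, is continuously dependent on $x_n$, i.e. to
prove that

$$\lim_{\delta \to 0} \int_{E_{x_n}} |\varphi(x_1, \dots, x_{n-1}, x_n + \delta) - \varphi(x_1, \dots, x_{n-1}, x_n)|^p\, dx_1 \dots dx_{n-1} = 0.$$

We have, in view of Hölder's inequality:

$$\begin{aligned}
\int_{E_{x_n}} |\psi_k(x_1, \dots, x_{n-1}, x_n + \delta) - \psi_k(x_1, \dots, x_{n-1}, x_n)|^p\, dx_1 \dots dx_{n-1}
&= \int_{E_{x_n}} \Big| \int_{x_n}^{x_n + \delta} \frac{\partial\psi_k(x_1, \dots, x_{n-1}, \tau)}{\partial\tau}\, d\tau \Big|^p dx_1 \dots dx_{n-1} \le \\
&\le \delta^{p/p'} \int_{E_{x_n}} \int_{x_n}^{x_n + \delta} \Big| \frac{\partial\psi_k}{\partial\tau} \Big|^p d\tau\, dx_1 \dots dx_{n-1} \le \delta^{p/p'}\, \Big\| \frac{\partial\psi_k}{\partial x_n} \Big\|^p_{L_p(D)},
\end{aligned}$$

and in the limit:

$$\int_{E_{x_n}} |\psi(x_1, \dots, x_{n-1}, x_n + \delta) - \psi(x_1, \dots, x_{n-1}, x_n)|^p\, dx_1 \dots dx_{n-1} \le \delta^{p/p'}\, \Big\| \frac{\partial\psi}{\partial x_n} \Big\|^p_{L_p(D)},$$

whence follows the required relationship with $x_n \in [a/2, a]$. The
same holds for $x_n \in [0, a/2]$.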

Note. The condition for continuity of the derivatives $\partial\varphi_k(x)/\partial x_n$
can be weakened in the theorem by requiring only the existence of the
generalized derivatives from $L_p(D)$. In this case (149) will hold [110]
for all $x_n \in [0, a]$ and almost all $(x_1, x_2, \dots, x_{n-1})$ of E, and all the
subsequent arguments retain their force. The limit of $\partial\varphi_k(x)/\partial x_n$ in
$L_p(D)$ will obviously be the generalized derivative $\partial\varphi(x)/\partial x_n$ in D. It
follows from the last theorem and the properties of the generalized
derivatives that, if $\varphi(x)$ is given say in the cube Q [$0 \le x_i \le 1$;
i = 1, 2, ..., n] and belongs to $L_p(Q)$ (p > 1) together with the
generalized derivative $\partial\varphi(x)/\partial x_n$, it is equivalent to a function defined
on every section of the cube Q by the planes $x_n = b$ ($0 \le b \le 1$),
belonging to $L_p(E_{x_n})$ on these sections, and continuously dependent on $x_n$
in the norm of $L_p(E)$. This follows from the fact that such a function
$\varphi(x)$ can be approximated [111] by functions $\varphi_k(x)$ continuously
differentiable in Q as indicated in the conditions of the theorem. In
particular, boundary values of $\varphi(x)$ will exist in the sense indicated on
the boundaries $x_n = 0$ and $x_n = 1$ of the cube.

Now let $\varphi(x)$ be given in some bounded domain D and belong to
$W_p^{(1)}(D)$. Further, let the boundary of D contain a smooth (n − 1)-
dimensional piece S. We map the part of D adjacent to S into a
parallelepiped with the aid of a transformation y = y(x), continuously
differentiable as far as the boundary (on the assumption that S is
such that this transformation is possible). If

$$x_n = F(x_1, \dots, x_{n-1}) \quad (\alpha \le x_k \le \beta;\ k = 1, 2, \dots, n - 1)$$

is the equation of S, and the points $(x_1, x_2, \dots, x_n)$, satisfying
$\alpha \le x_k \le \beta$ (k = 1, 2, ..., n − 1); $0 \le x_n - F(x_1, \dots, x_{n-1}) \le \gamma$,
belong to D, we can take as the new variables $y_k$:

$$y_1 = x_1; \quad y_2 = x_2; \quad \dots; \quad y_n = x_n - F(x_1, \dots, x_{n-1}).$$

The parallelepiped is defined by $\alpha \le y_k \le \beta$ (k = 1, 2, ..., n − 1);
$0 \le y_n \le \gamma$. In the new variables $\varphi(y)$ will belong to $W_p^{(1)}(Q)$, where Q
is the parallelepiped, and what has been said above will hold for it.

In particular, on approaching the boundary $\Gamma$ of Q, which
is the image of the piece S, the value of $\varphi(y)$ will approach in the norm
of $L_p(\Gamma)$ the values of $\varphi(y)$ on the boundary itself. This means, in the
old coordinates, that the values of $\varphi(x)$ on S and the values of $\varphi(x)$ on
"the corresponding images of the displaced surfaces" will be close
to each other in the sense of the norm of $L_p(S)$. We can speak in this
sense of the values on smooth (n − 1)-dimensional surfaces of the
functions $\varphi(x)$ of $W_p^{(1)}(D)$ (and in particular, of their values on smooth
pieces of (n − 1)-dimensional boundaries), and of the fact of these
values being taken continuously.

We now show that, given the conditions indicated below, the usual
formula for integration by parts holds for functions of space $W_p^{(1)}(D)$:

$$\int_D \frac{\partial\varphi(x)}{\partial x_i}\, \psi(x)\, dx = -\int_D \varphi(x)\, \frac{\partial\psi(x)}{\partial x_i}\, dx + \int_S \varphi(x)\, \psi(x) \cos(n, x_i)\, dS, \tag{150}$$

where n is the outward normal to the boundary S of the domain D. We
assume that D can be divided into a finite number of domains $D_k$,
each of which is star-shaped with respect to one of its points and has
a piecewise smooth boundary. If we can prove that (150) holds for
each of the $D_k$, it can be shown to hold for the whole of D by summation
of the equation over all the $D_k$. So let D be star-shaped and
$\varphi(x) \in W_p^{(1)}(D)$, $\psi(x) \in W_{p'}^{(1)}(D)$ ($1/p + 1/p' = 1$). By the theorem of [111],
there exist sequences of functions $\varphi_k(x)$ and $\psi_k(x)$, continuously
differentiable in D, convergent to $\varphi(x)$ in $W_p^{(1)}(D)$ and $\psi(x)$ in $W_{p'}^{(1)}(D)$
respectively. By Theorem 3, $\varphi_k(x) \to \varphi(x)$ in $L_p(S)$, and $\psi_k(x) \to \psi(x)$
in $L_{p'}(S)$. Formula (150) holds for $\varphi_k(x)$ and $\psi_k(x)$. On passing to the
limit in it with respect to k, we find that (150) holds for $\varphi(x)$ and $\psi(x)$.
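The formula for integration by parts can be illustrated on the simplest domain. The sketch below assumes D is the unit square, differentiation is with respect to the first variable, and the smooth sample functions are φ = x²y + 1 and ψ = sin x + y; all names and the midpoint quadrature are choices made for the illustration only. On the square, cos(n, x) equals +1 on the side x = 1, −1 on the side x = 0, and 0 on the other two sides.

```python
import math


def midpoint_1d(g, m=400):
    """Midpoint rule on (0, 1)."""
    h = 1.0 / m
    return h * sum(g((i + 0.5) * h) for i in range(m))


def midpoint_2d(g, m=400):
    """Midpoint rule on the unit square."""
    h = 1.0 / m
    return h * h * sum(
        g((i + 0.5) * h, (j + 0.5) * h) for i in range(m) for j in range(m)
    )


def phi(x, y):
    return x * x * y + 1.0


def dphi_dx(x, y):
    return 2.0 * x * y


def psi(x, y):
    return math.sin(x) + y


def dpsi_dx(x, y):
    return math.cos(x)


# Left-hand side: integral over D of (d phi / dx) * psi.
lhs = midpoint_2d(lambda x, y: dphi_dx(x, y) * psi(x, y))

# Right-hand side: minus the integral of phi * (d psi / dx), plus the boundary term.
volume_term = -midpoint_2d(lambda x, y: phi(x, y) * dpsi_dx(x, y))
boundary_term = (
    midpoint_1d(lambda y: phi(1.0, y) * psi(1.0, y))    # side x = 1, cos(n, x) = +1
    - midpoint_1d(lambda y: phi(0.0, y) * psi(0.0, y))  # side x = 0, cos(n, x) = -1
)
rhs = volume_term + boundary_term
print(lhs, rhs)
```

Both sides agree with the exact value sin 1 − cos 1 + 1/3 up to the quadrature error.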

The solution of boundary value problems of mathematical physics in- 
volves a discussion of certain subspaces of WD), consisting of elements 
satisfying given homogeneous boundary conditions. They were first 
introduced by K. Friedrichs (see Hilbert-Courant, Methoden der
mathematischen Physik, vol. II, ch. VII). As above, let D be a bounded domain
of space & (£i, ta, ..., Un). We shall write C\?(D) for the set of all finite 
functions, continuous and continuously differentiable up to order / in 
D. We introduce the norm of WD) into this lineal, and we write 
WD) for the result of closure of O(D) with respect to this norm. 
If g(a) € CD) and y(x) € WD) (1/p + 1/p’ = 1), we have by 
definition of the generalized derivatives: 


{ D* pla) pla) de = (— 1)* f ple) D" y(x) de. (151) 
D D 


On noticing that the elements of W(D) are the limits of elements of 
C®(D) in the norm of WD ), we can say that (151) holds for any 
g(x) € WD) and y(x) € ew? (D). Obviously, W OD) belongs to 




WOD). It may easily be seen that SYD) is a regular part of WD) 
(l > 1). For suppose we take the case l= 1. We write down the 
formula for integration by parts: 





JEE vee) de = — fox) FEEL de + fola) ya) cos (n, z) a8, 
D D S 


where g(x) and y(x) are continuously differentiable in D. There exist 
in WD) functions g(x) of the type indicated, non-zero on S, and the 
integral over S will be non-zero for these, given a suitable choice of 
p(x). For any other g(x) of WD), (151) holds with k = 1, the integral 
over the surface being absent. 

Since y(x) € WD) is arbitrary, (151) really tells us that functions 
p(x) of WD) “vanish on the boundary together with their deriva- 
tives up to order l — 1”. 

If the boundary S is sufficiently smooth, g(x) and its derivatives up 
to order (l — 1) tend to zero in the norm of L (S) on approaching 8, 
as mentioned above. 

In the general case our convergence condition only holds in the 
sense that (151) holds for any g(x) € WY(D) and y(x) € WSAD). 
When considering space WD), the smoothness of the boundary 
does not play an essential part, since, on locating D inside some sphere 
D, and extending the definition of v(x) in D, by zero (outside D), 
we get g(x) € W(D,). Such an extension of functions of W)(D) gives 
us the possibility of drawing more complete conclusions without 
introducing domains D’ lying strictly inside D. In particular: (1) 
functions p(x) of WD) can be approximated in the norm of W(D) 
by finite functions infinitely differentiable in D; (2) if p(x) € 
€ WY(D) with l > [n/2] + 1, g(x) is equivalent to a function con- 
tinuous in D and vanishing on the boundary of D. 

We conclude by showing that the closure of functions of oD) 
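in $W_p^{(0)}(D)$ (p > 1), i.e. in $L_p(D)$, gives the whole of space $L_p(D)$.
In other words, smooth finite functions form a set dense in $L_p(D)$.

For, let $D_\delta$ be the set of points not less than a distance $\delta$ from the
boundary of D, and suppose that, for any $\varphi(x) \in L_p(D)$:

$$\varphi^{(\delta)}(x) = \begin{cases} \varphi(x), & \text{if } x \in D_\delta, \\ 0, & \text{if } x \notin D_\delta. \end{cases}$$

The functions $\varphi^{(\delta)}(x)$ are obviously dense in $L_p(D)$, since, as $\delta \to 0$,

$$\|\varphi - \varphi^{(\delta)}\|^p_{L_p(D)} = \int_{D - D_\delta} |\varphi(x)|^p\, dx \to 0.$$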




We form the mean functions $\varphi_h^{(\delta)}(x)$ for $\varphi^{(\delta)}(x)$ with $h < \delta/2$. These
functions are smooth and finite in D, and converge to $\varphi^{(\delta)}(x)$ in $L_p(D)$
as $h \to 0$. Also, they form a set dense in $L_p(D)$. We have therefore
shown that $\mathring{W}_p^{(l)}(D)$ and $W_p^{(l)}(D)$ coincide when l = 0.


114. Embedding theorems. We now turn to a detailed discussion
of the properties of functions of space $W_p^{(l)}(D)$ and establish the
connection between the behaviour of the functions and that of their
derivatives in the domain itself and in sections of it of different
dimensions. A number of important inequalities will be obtained in
this connection, and we shall discuss on the basis of these the question
of equivalent norms in space $W_p^{(l)}(D)$. The aggregate of these results
is generally known as Sobolev's "embedding theorems". We shall state
these theorems in the present section.

We shall assume that D is star-shaped with respect to every point 
of some sphere K lying inside D, or that D can be divided by smooth 
surfaces into a finite number of domains of this type. 

It may be shown for domains of this type that, if p(x) has all the 
generalized derivatives of order l > 1 in D, where p(x) and these derivatives 
belong to L,(D) (p > 1), then g(x) has every generalized derivative up 
to order l, and these also belong to L,(D). In other words, the class of 
functions WD) is embedded in the class W D) when m < l, as 
also in class WED). It follows from this, by the way, that classes 
WD) and XD) consist of the same functions for domains of the 
type indicated. We agree to use for brevity the notation 


lll! pat = llell ws (D): 


It may be shown that 
1-1 
D liell < 4 llelo» (152) 
k=l 


where A is a positive constant independent of the choice of p(x) € 
€ WD). Inequality (152) shows that the norms of spaces WOD) 
and WD) are equivalent, and we may not differentiate between 
these spaces in future. These assertions are contained as particular 
cases in more general theorems, to the statement of which we now 
proceed. 

The above assertions, and the theorems stated below, will be proved 
in [115—118]. 




We introduce as a preliminary the space $C^{(l)}(\bar{D})$ of functions
continuous in $\bar{D}$ and having all derivatives up to and including order l,
these being also continuous in $\bar{D}$, the norm in this space being defined
as follows:

$$\|\varphi\|_{C^{(l)}(\bar{D})} = \max_{\substack{x \in \bar{D} \\ 0 \le k \le l}} |D^k\varphi(x)|,$$

where $D^k\varphi(x)$ is any derivative of order k and the maximum is taken
over $x \in \bar{D}$ and over all possible derivatives up to order l. We recall
that the continuity of any given derivative $D^k\varphi(x)$ in $\bar{D}$ is understood
as follows: $\varphi(x)$ has a continuous derivative $D^k\varphi(x)$ inside D, and this
latter can be given a supplementary definition on the boundary of D
so as to obtain a function continuous in $\bar{D}$. Instead of the above norm
in $C^{(l)}(\bar{D})$, we can introduce the equivalent norm in accordance with

$$\|\varphi\| = \sum_{k=0}^{l}\ \sum_{k_1+\dots+k_n=k}\ \max_{x \in \bar{D}} \Big| \frac{\partial^k\varphi(x)}{\partial x_1^{k_1} \dots \partial x_n^{k_n}} \Big|.$$

Obviously, $C^{(l)}(\bar{D})$ is a complete B space.
We shall write $C(\bar{D})$, as earlier, for space $C^{(0)}(\bar{D})$.

THEOREM 1. If p > 1 and pl > n, every function $\varphi(x) \in W_p^{(l)}(D)$ is
equivalent to a function of $C(\bar{D})$ and

$$\|\varphi\|_{C(\bar{D})} \le M\, \|\varphi\|_{W_p^{(l)}(D)}, \tag{153}$$

where M is a constant depending only on the domain D. Every set U,
bounded in $W_p^{(l)}(D)$, is compact in $C(\bar{D})$.

THEOREM 2. If p > 1 and $pl \le n$, every function $\varphi(x) \in W_p^{(l)}(D)$ is
equivalent to a function $\varphi(x)$ which is defined almost everywhere on the
section $D_s$ of domain D by any plane of s > n − pl dimensions, $\varphi(x)$
being summable on $D_s$ to any degree q satisfying

$$1 < q < \frac{ps}{n - pl} \tag{154}$$

and where

$$\|\varphi\|_{L_q(D_s)} \le M_1\, \|\varphi\|_{W_p^{(l)}(D)}, \tag{155}$$

in which $M_1$ is a constant depending only on the domain D and the
section $D_s$. In addition, given any $\varepsilon > 0$, there exists an $\eta > 0$, the same
for all $\varphi(x)$ whose norms in $W_p^{(l)}(D)$ do not exceed a fixed number, such
that

$$\int_{D_s} |\varphi(x + \Delta x) - \varphi(x)|^q\, ds < \varepsilon, \tag{156}$$

where $|\Delta x| < \eta$ and the points x and $x + t\Delta x$ ($0 \le t \le 1$) belong to
D. It follows from what has been said that a set, bounded in $W_p^{(l)}(D)$,
is compact in $L_q(D_s)$.

Notice that we can take as $D_s$ either a complete or an incomplete
plane s-dimensional section of D, as also an n-dimensional domain
belonging to D. If pl = n, we can take any q greater than unity in the
second theorem. When pl < n, the right-hand side of (154) is greater
than unity. The theorem remains valid if the plane is replaced by a
smooth surface.
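The arithmetic behind the two theorems is easily mechanised. The following sketch, with the hypothetical helper `embedding_target`, merely classifies a triple (n, p, l) and, in the case of Theorem 2, returns the critical exponent ps/(n − pl) that q must stay strictly below; it is an aid for reading the conditions, not a statement about any particular domain.

```python
import math


def embedding_target(n, p, l, s=None):
    """Classify the embedding of W_p^(l)(D), D in R^n, by Theorems 1 and 2.

    Returns ("continuous", None) when pl > n (Theorem 1), otherwise
    ("L_q", q_star), where q must satisfy q < q_star (Theorem 2);
    q_star is infinite when pl = n. The section dimension s defaults to n.
    """
    if p <= 1 or l < 1:
        raise ValueError("the sketch assumes p > 1 and l >= 1")
    s = n if s is None else s
    if p * l > n:
        return ("continuous", None)
    if not (n - p * l < s <= n):
        raise ValueError("Theorem 2 requires n - pl < s <= n")
    if p * l == n:
        return ("L_q", math.inf)
    return ("L_q", p * s / (n - p * l))


print(embedding_target(3, 2, 2))        # pl = 4 > 3: continuous functions
print(embedding_target(3, 2, 1))        # pl = 2 < 3: q below 6 in the domain
print(embedding_target(3, 2, 1, s=2))   # values on a surface: q below 4
```

For n = 3, p = 2, l = 1 this gives the exponent bounds 6 (in the domain itself) and 4 (on a two-dimensional section).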

Note. By Theorem 1, every function $\varphi(x) \in W_p^{(l)}(D)$, given
pl > n, is also a function of $C(\bar{D})$, i.e. it is embedded from $W_p^{(l)}(D)$ into
$C(\bar{D})$. By (153), the embedding operator, associating each function of
$W_p^{(l)}(D)$ with the same function as an element of $C(\bar{D})$, is bounded,
and the last assertion of the theorem amounts to the fact that this
operator is also completely continuous. A similar remark can be made
for Theorem 2. Let us also mention some corollaries of the embedding
theorems. If pl > n and the integer m satisfies $0 \le m < l - n/p$, any
$\varphi(x) \in W_p^{(l)}(D)$ is continuously differentiable in $\bar{D}$ up to order m, the
functions $D^k\varphi(x)$ with $k \le m$ being equivalent to the corresponding
derivatives of $\varphi(x)$, continuous in $\bar{D}$, and there exists a positive
number A, depending only on D, such that

$$|D^k\varphi(x)| \le A\, \|\varphi\|_{W_p^{(l)}(D)}. \tag{157}$$

Hence it follows that $W_p^{(l)}(D)$ with pl > n is the part of the space
of functions continuously differentiable in $\bar{D}$ up to order l − [n/p] − 1
(part of space $C^{(l-[n/p]-1)}$). If

$$m \ge 0, \quad m \ge l - \frac{n}{p} \quad \text{and} \quad s > n - (l - m)p,$$

we have on every sufficiently smooth s-dimensional manifold $F_s$ in D:

$$D^m\varphi(x) \in L_q(F_s) \quad \text{for} \quad q < \frac{sp}{n - (l - m)p}, \tag{158}$$

and there exists a positive number $A_1$, depending only on D and $F_s$,
such that

$$\|D^m\varphi(x)\|_{L_q(F_s)} \le A_1\, \|\varphi\|_{W_p^{(l)}(D)}. \tag{159}$$

The embedding operators are completely continuous in all our
examples.

Theorems 1 and 2 enable us to construct different norms in $W_p^{(l)}(D)$,
equivalent to the basic norm (145) or (138). The next Theorem 3 gives
a more general result in this connection.




THEOREM 3. Let $l_k(u)$ (k = 1, 2, ..., N) be linear bounded functionals
in $W_p^{(l)}(D)$ such that they do not vanish simultaneously on a
non-identically-zero polynomial of degree not greater than l − 1. Then

$$\|\varphi\|^p = \sum_{k_1+\dots+k_n=l} \|D^l\varphi\|^p_{L_p(D)} + \sum_{k=1}^{N} |l_k(\varphi)|^p \tag{160}$$

defines a norm equivalent to the basic norm (144) or (138).
Notice first of all that, by what was said in [112], norm (160) is
equivalent say to

$$\|\varphi\| = \sum_{k_1+\dots+k_n=l} \|D^l\varphi\|_{L_p(D)} + \sum_{k=1}^{N} |l_k(\varphi)| \tag{161}$$

and to other analogous expressions.
We shall quote some examples of equivalent norms in space $W_p^{(1)}(D)$.
It follows from Theorem 3 that we can specify the norm in this space by

$$\|\varphi\| = \sum_{k=1}^{n} \Big\| \frac{\partial\varphi}{\partial x_k} \Big\|_{L_p(D)} + \Big| \int_D \varphi(x)\, dx \Big|. \tag{162}$$

For, the functional $\int_D \varphi(x)\, dx$ is linear and bounded in $W_p^{(1)}(D)$ by
virtue of the inequality

$$\Big| \int_D \varphi(x)\, dx \Big| \le \|\varphi\|_{L_p(D)}\, (mD)^{1/p'} \le \|\varphi\|_{W_p^{(1)}(D)}\, (mD)^{1/p'},$$

and $\int_D c\, dx \ne 0$ for any constant $c \ne 0$ (here, mD denotes the measure
of domain D).

It follows from Theorem 2 that the norm

$$\|\varphi\| = \sum_{k=1}^{n} \Big\| \frac{\partial\varphi}{\partial x_k} \Big\|_{L_p(D)} + \|\varphi\|_{L_q(D)} \tag{163}$$

is equivalent to norm (138) for any q satisfying the condition
$1 < q < pn/(n - p)$ (p < n). When p = n, we can take any q > 1. If p > n,
we can again take any q > 1, or even replace the second term in (163)
by $\|\varphi\|_{C(\bar{D})}$. The same remark applies to the next formula (164).
The reader may easily verify that the expression

$$\|\varphi\| = \sum_{k=1}^{n} \Big\| \frac{\partial\varphi}{\partial x_k} \Big\|_{L_p(D)} + \|\varphi\|_{L_q(S)}, \tag{164}$$

where S is any given smooth (n − 1)-manifold in D, and the index q
satisfies $1 < q < p(n - 1)/(n - p)$ (p < n), also defines a norm
equivalent to (138). Similar arguments lead to various equivalent norms
in space $W_p^{(l)}(D)$ with l > 1.




115. Integral operators with a polar kernel. We now turn to a study of the
proofs of the embedding theorems, which we stated in the previous section.
We must first discuss a class of integral operators with polar kernel. As usual,
D denotes a bounded domain of n-dimensional space $R_n$. Let $\sigma_n$ be a sphere
of unit radius in $R_n$ (of n − 1 dimensions) and $|\sigma_n|$ the area of this sphere.
An elementary volume is given in $R_n$ [II; 173] by
$dx = dx_1\, dx_2 \dots dx_n = r^{n-1}\, dr\, d\sigma_n$, where

$$d\sigma_n = \sin^{n-2}\theta_1\, \sin^{n-3}\theta_2 \dots \sin\theta_{n-2}\ d\theta_1\, d\theta_2 \dots d\theta_{n-2}\, d\varphi.$$
THEOREM 1. Let B(x, y) be a kernel, bounded for x and y ∈ D and continuous
for x ≠ y. The integral operator

$$u(x) = \int_D \frac{B(x, y)}{|x - y|^{\lambda}}\, f(y)\, dy \tag{165}$$

is completely continuous (and hence bounded) as an operator from $L_p(D)$ into
$C(\bar{D})$ for $\lambda < n/p'$ ($1/p + 1/p' = 1$).†

† Here and below we assume p > 1.

By hypothesis, $|B(x, y)| \le B$, where B is constant. By Hölder's inequality:

$$|u(x)| \le B \Big[ \int_D |f(y)|^p\, dy \Big]^{1/p} \Big[ \int_{K_R} r^{-\lambda p'}\, dy \Big]^{1/p'},$$

where $K_R$ is the sphere of radius R with centre the origin, containing D,
and $r = |x - y|$. On passing to spherical coordinates with the origin at x,
we obtain

$$\Big[ \int_{K_R} r^{-\lambda p'}\, dy \Big]^{1/p'} \le \Big[ \int_0^{2R} r^{n-1-\lambda p'}\, dr \int_{\sigma_n} d\sigma_n \Big]^{1/p'} \tag{166}$$

and consequently,

$$|u(x)| \le B \Big( \frac{|\sigma_n|}{n - \lambda p'} \Big)^{1/p'} (2R)^{\frac{n}{p'} - \lambda}\, \|f\|_{L_p(D)}. \tag{167}$$

This inequality shows that, if $\|f\|_{L_p(D)} \le C$, where C is a constant, the
corresponding functions u(x) are uniformly bounded in D. To prove the theorem,
it is sufficient to show that they are equicontinuous. Let $\delta > 0$ be a sufficiently
small number and $D^{(\delta)}$ the set of the points y ∈ D for which $|y - x| > \delta$.
We obtain for x and $x + \Delta x \in D$:

$$\begin{aligned}
|u(x + \Delta x) - u(x)| \le{} & \int_{D^{(\delta)}} \Big| \frac{B(x + \Delta x, y)}{|x + \Delta x - y|^{\lambda}} - \frac{B(x, y)}{|x - y|^{\lambda}} \Big|\, |f(y)|\, dy \ + \\
& + B \int_{D - D^{(\delta)}} \frac{|f(y)|}{|x - y|^{\lambda}}\, dy + B \int_{D - D^{(\delta)}} \frac{|f(y)|}{|x + \Delta x - y|^{\lambda}}\, dy,
\end{aligned}$$

where we assume $|\Delta x| < \delta/2$. Given any $\varepsilon > 0$, there exists an $\eta > 0$ such
that the absolute value of the difference in the first integral is $< \varepsilon$ if $|\Delta x| < \eta$.
On applying Hölder's inequality, we obtain the upper bound $\varepsilon |D|^{1/p'} C$ for
the first integral, where |D| is the measure of D. An upper bound can be found
for the second integral from inequality (167) with $2R = \delta$, whilst the third
integral does not exceed the integral of the corresponding integrand over the
part of the sphere

$$|y - (x + \Delta x)| \le \frac{3\delta}{2},$$

which belongs to D, and the upper bound of this integral is also found from (167)
with $2R = 3\delta/2$.
Finally:

$$|u(x + \Delta x) - u(x)| \le \varepsilon\, |D|^{1/p'}\, C + B \Big( \frac{|\sigma_n|}{n - \lambda p'} \Big)^{1/p'} \Big[ \delta^{\frac{n}{p'} - \lambda} + \Big( \frac{3\delta}{2} \Big)^{\frac{n}{p'} - \lambda} \Big]\, C,$$

whence, since our choice of $\delta$ and $\varepsilon$ is arbitrary, we see that the u(x) are
equicontinuous for $\|f\|_{L_p(D)} \le C$. The theorem is thus proved.
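The constant in (167) is explicit and can be evaluated directly. The sketch below assumes the standard formula $|\sigma_n| = 2\pi^{n/2}/\Gamma(n/2)$ for the area of the unit sphere in $R_n$; the function names `unit_sphere_area` and `polar_kernel_bound` are hypothetical and serve only to show how the condition λ < n/p' enters.

```python
import math


def unit_sphere_area(n):
    """Area |sigma_n| of the unit sphere in R^n: 2 * pi^(n/2) / Gamma(n/2)."""
    return 2.0 * math.pi ** (n / 2.0) / math.gamma(n / 2.0)


def polar_kernel_bound(B, n, lam, p, R, f_norm):
    """Right-hand side of (167): a bound for |u(x)| uniform in x.

    B      -- bound for |B(x, y)|
    lam    -- exponent of the polar singularity |x - y|^(-lam)
    p      -- f is taken from L_p(D), p > 1
    R      -- radius of a sphere containing D
    f_norm -- the norm of f in L_p(D)
    """
    if p <= 1:
        raise ValueError("p > 1 is assumed")
    p_conj = p / (p - 1.0)            # the conjugate exponent p'
    gap = n - lam * p_conj            # must be positive: lam < n / p'
    if gap <= 0:
        raise ValueError("the bound requires lam < n / p'")
    return (
        B
        * (unit_sphere_area(n) / gap) ** (1.0 / p_conj)
        * (2.0 * R) ** (n / p_conj - lam)
        * f_norm
    )


print(unit_sphere_area(3))                                  # 4 * pi
print(polar_kernel_bound(1.0, 3, 1.0, 2.0, 0.5, 1.0))       # sqrt(4 * pi)
```

As λ approaches n/p' from below the factor $(n - \lambda p')^{-1/p'}$ grows without bound, which is why the theorem stops at λ < n/p'.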

THEOREM 2. Let $n > \lambda \ge n/p'$, the integer $s > n - (n - \lambda)p$ (or what
amounts to the same thing, $s/p > \lambda - n/p'$) and $s \le n$. The integral operator
(165) is completely continuous as an operator from $L_p(D)$ into $L_q(D_s)$, where $D_s$
is any given s-dimensional plane section and q is any number satisfying

$$1 \le q < q^* = \frac{sp}{n - (n - \lambda)p}.$$

Assuming that B(x, y) is defined in the domain $D_1$, containing D in its interior,
and possesses the above-mentioned properties in $D_1$, we obtain in addition:

$$\|u(x + \alpha x^0) - u(x)\|_{L_q(D_s)} \le \varepsilon(\alpha)\, \|f\|_{L_p(D)}, \tag{168}$$

where $x^0$ is a fixed vector, and $\varepsilon(\alpha)$ is continuous for $0 \le \alpha \le \delta$ (where $\delta$
is a positive number), is equal to zero for $\alpha = 0$, and is defined by the constant
B, as also by the dimensions of D and $D_s$. When s = n we can take any subdomain
of D as $D_s$.

Note. It need not be assumed that B(x, y) is defined in a wider domain
$D_1$, but in this case we have to regard the displacements $\alpha x^0$ as permissible
in (168), i.e. such that the points $x + \alpha x^0$ are situated in D. Notice also that,
when $n - (n - \lambda)p = 0$, q is arbitrary.

We shall divide the proof into two parts.

Lemma 1. Given the conditions mentioned, operator (165) is bounded as an
operator from $L_p(D)$ into $L_q(D_s)$.

It follows from the conditions of the theorem that $q^* > p$, and we shall
assume for the moment that $q \ge p$. We have











8 8 n 8 n 
=> — =1-— , ie. A=— 7—2 > 0). 
17 F p aP B (8 ) 
We put 
1 1 1 l 
= —; Q, = — — —; a= 7 a a. a) = l. 
ay q 2 p q 3 p (a, + a, + a4) 




On twice applying Hélder’s inequality, we arrive at the following inequality 
with three factors: 


fintetstar <| fiz EPLE iSi gk 
D D D R 


On applying it to the integral on the right-hand side of the obvious inequality 
Pp s 1 1 n 
Les E es T 
lua < B f {ir |(r e w? ) (e Jay, 
Kr 


where f(y) is continued by zero on the part of the sphere lying outside D, we 


obtain 
1 1 1 


Lul < B| fiyo res av] | [itm Pay]? * ie ay] A 
Ke Ke Kr 


The second factor on the right is || f Iig. The third is subject to the familiar 
inequality i 





f "B 
fersa ian, 
Kr 
whence 
> 1-2 r 
u(x) < B -zel ) (2R) |l F liepo [ J ti) prta |" 
K 
and z 
a 
Pa D Ker 


where the differential dx‘) refers to D,. We change the order of integration and 
find an upper bound for the inner integral of 7-*+4" over D,, which is taken with 
constant y. We notice first that 


n 
p’ 
On introducing into D, spherical coordinates whose centre is the projection 


of y on D, we get dx = op Idede,, and r75+1P< g75+1Ê, since g < r and 
—e8e + qf <0. Hence 


Jany dx < fee -t do do, = | os | (2R)® f 





-s+ =-9(2-8)=-a(1- +8) <0. 


gB 
D; D; 
and, by (169): 
llu ilro < (2520 || F llrao) » (170) 
where 
Ie (Lola 
o = B (Le PA" 171 
p’ B 9B (a71) 


does not depend on the dimensions of D. 




We have assumed $q \ge p$. If we take q < p, it is sufficient to use the inequality

$$\|u\|_{L_q(D_s)} \le \|u\|_{L_{q_1}(D_s)}\, |D_s|^{\frac{1}{q} - \frac{1}{q_1}} \quad (p \le q_1 < q^*) \tag{172}$$

($|D_s|$ is the measure of $D_s$), which follows directly from Hölder's inequality;
the further factor $|D_s|^{\frac{1}{q} - \frac{1}{q_1}}$ appears in the expression for C.

Lemma 2. Operator (165) is continuous with respect to a displacement in the 
sense indicated in Theorem 2. 

As in Theorem 1, 





B(x, y) B(x + Az, y) 
Aa) — u — . 
| ue + Aa) wale: | yep ly — e+ day |W) dy + 
ify) | f(y) | f 
B ee! 4 B NE d 73 
4 erate f ote day a 
ly-x] <ð ly-x1<o8 


where, as above, | Ax | < 6/2. Given e > 0, there exists an 7 > 0 such that, 
with | dz | < 7, 
1 1 
|| u(x + Aa) — ulz) llran) < El DIP | Dsl? WF leew) +--+ 


and the row of dots is the sum of the norms of the second and third terms 
on the right-hand side of (173). We have inequality (170) for these norms, with 
2R = 6 and 2R = 36/2, so that 


|| u(x + Ax) — u(x) llipa) < (C1 E + C28?) | F llr (174) 
where 
1 1 1 1 
z P q. — pg(leal\p (losl\a 3) 
omiani o= aG P Us p+ GF] wo. 


When qg < p, a further factor appears in the expression for C,. 

To prove Theorem 2, it remains to show that, given $\|f\|_{L_p(D)} \le A$, where
A is a constant, a set of functions u(x) is obtained, compact in $L_q(D_s)$. The
u(x) are bounded in $L_q(D_s)$ by virtue of (170), whilst they are equicontinuous by
virtue of (174), if we assume that $\Delta x$ lies in $D_s$ and notice that the
above-mentioned $\eta > 0$, which was defined according to $\varepsilon$, depends on the kernel
B(x, y), but not on f(y). Theorem 2 is proved.

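THEOREM 3. Let $B_1(x, y)$ and $B_2(x, y)$ be bounded kernels for x and y ∈ D
and be continuous for x ≠ y. Then the integral

$$I(x, z) = \int_D \frac{B_1(x, y)\, B_2(y, z)}{|x - y|^{\lambda}\, |y - z|^{\mu}}\, dy \quad (\lambda < n;\ \mu < n), \tag{175}$$

given x and z ∈ D, can be written as

$$I(x, z) = B(x, z)\, \omega(|x - z|), \tag{176}$$

where B(x, z) has the same properties as $B_i(x, y)$ (i = 1, 2) and

$$\omega(\delta) = \begin{cases} \delta^{-(\lambda + \mu - n)} & \text{for } \lambda + \mu > n, \\ 1 + |\log \delta| & \text{for } \lambda + \mu = n, \\ 1 & \text{for } \lambda + \mu < n. \end{cases} \tag{177}$$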



Let x ≠ z. We divide D into two parts: $D = D_1 + D_2$, such that $D_1$ contains
only the point x and $D_2$ only z. On choosing p' > 1 such that $\lambda < n/p'$, and
taking $f(y) = B_2(y, z)\, |y - z|^{-\mu}$ in operator (165), we can assert on the basis
of Theorem 1 that integral (175), taken over $D_1$, is a continuous function of
(x, z). Similarly for the integral over $D_2$. The continuity of I(x, z) for x ≠ z is
thus proved.

To prove (176), we only need to show that

$$\int_D \frac{dy}{|x - y|^{\lambda}\, |y - z|^{\mu}} \le C\, \omega(|x - z|), \tag{178}$$

where C is a positive constant.
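1. Let $\lambda + \mu > n$. We write $|x - z| = \delta$ and introduce the new coordinates

$$x' = \frac{x}{\delta}, \quad y' = \frac{y}{\delta}, \quad z' = \frac{z}{\delta},$$

so that $|x' - z'| = 1$. Using the ordinary notation $R_n$ for the whole of
n-dimensional space, we obtain

$$\int_D \frac{dy}{|x - y|^{\lambda}\, |y - z|^{\mu}} \le \int_{R_n} \frac{\delta^n\, dy'}{\delta^{\lambda + \mu}\, |x' - y'|^{\lambda}\, |y' - z'|^{\mu}}.$$

To evaluate the integral over $R_n$, we locate the origin at the point x' and the
$x_1$ axis in the direction from x' to z'. We now obtain

$$\int_D \frac{dy}{|x - y|^{\lambda}\, |y - z|^{\mu}} \le \delta^{-(\lambda + \mu - n)} \int_{R_n} \frac{dy'}{|y'|^{\lambda}\, |y' - z_0|^{\mu}},$$

where $z_0$ has the coordinates (1, 0, 0, ..., 0). The latter integral is convergent,
since $\lambda < n$, $\mu < n$ and $\lambda + \mu > n$, and is obviously independent of x, z,
whence it follows that

$$\int_D \frac{dy}{|x - y|^{\lambda}\, |y - z|^{\mu}} \le C\, |x - z|^{-(\lambda + \mu - n)}.$$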


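2. The case $\lambda + \mu = n$. Let $K_R$ be the sphere with centre the origin and
radius R, containing D; on introducing the same coordinates as above, we
have

$$\int_D \frac{dy}{|x - y|^{\lambda}\, |y - z|^{\mu}} \le \int_{K_R} \frac{dy}{|x - y|^{\lambda}\, |y - z|^{\mu}} = \int_{K_{R/\delta}} \frac{dy'}{|x' - y'|^{\lambda}\, |y' - z'|^{\mu}} \le \int_{K_{2R/\delta}} \frac{dy'}{|y'|^{\lambda}\, |y' - z_0|^{\mu}}.$$

We have doubled the radius in the last integral, but can integrate as before
over the sphere with centre the origin. The positive $\delta$ can be assumed
sufficiently small. Let $2R/\delta > 2$. The integral over $K_{2R/\delta}$ is split into two: over
$K_2$ and over $K_{2R/\delta} - K_2$. The integration over $K_2$ gives a positive constant
$C_1$. It remains to consider

$$J_1 = \int_{K_{2R/\delta} - K_2} \frac{dy'}{|y'|^{\lambda}\, |y' - z_0|^{\mu}}.$$

Since $|y' - z_0| \ge |y'| - 1$, we obtain, on writing $|y'| = r$:

$$J_1 \le |\sigma_n| \int_2^{2R/\delta} \frac{r^{n-1}\, dr}{r^{\lambda}\, (r - 1)^{\mu}} = |\sigma_n| \int_2^{2R/\delta} \Big( \frac{r}{r - 1} \Big)^{\mu} \frac{dr}{r} \quad (\lambda + \mu = n),$$

or, since $r \ge 2$,

$$J_1 \le |\sigma_n|\, 2^{\mu} \int_2^{2R/\delta} \frac{dr}{r} = |\sigma_n|\, 2^{\mu} \log \frac{R}{\delta}.$$

Finally,

$$\int_D \frac{dy}{|x - y|^{\lambda}\, |y - z|^{\mu}} \le C_1 + |\sigma_n|\, 2^{\mu} \log \frac{R}{\delta} \le C\, (1 + |\log \delta|)$$

for a suitable positive constant C, whence follows inequality (178) for
$\lambda + \mu = n$.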

3. The investigation of the case 4 + u < n is basically the same as above. 
Notice that integral (178) is now convergent with x = z. Precisely as in the 
previous case, we have 


oe dy’ 
<ò" a "| mr < 
Jr TETTE P ly’ Ply — z l” 
ð 
2R 


3 n—1 
d 
S i — a |< 
pate ( es =) 
7 
2 
2R 


ò 
< "72e [o + |a,| 24 f ee aa ar| < 
2 


< "h-u fo, + | on | 24 Ay] ; 


i.e. we have 


and Theorem 3 is proved. 




116. Sobolev’s integral forms. We shall now assume that the domain D 
is star-shaped with respect to every point of a sphere K lying inside D. We take 
the centre of this sphere as origin and let R denote the radius. We introduce 
the following infinitely differentiable function [71]: 


R 
Ce RIY forļy|<R, 


PW) = ho tor |y] >R. (179) 


where the constant C is chosen so that the integral of p(y) over K is equal to 
unity. 

Let u(y) be any given continuous function, continuously differentiable in
D. On introducing spherical coordinates with the origin at a point x, we can
regard u(y) as a function of x, r and $\omega_n$, where $\omega_n$ denotes the set of angular
spherical coordinates [IV; 156], i.e. $u(y) = u(x, r, \omega_n)$, where
$u(x, 0, \omega_n) = u(x)$. Let us consider the integral of u(y) p(y) over D. The
integration is in fact confined to the sphere K. On writing p(y) in the form
$p(x, r, \omega_n)$, and integrating by parts with respect to the variable r, we get


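$$\int_D u(y)\,p(y)\,dy = \int_{\omega_n}\left[\int_0^{\infty} u(x,r,\omega_n)\,p(x,r,\omega_n)\,r^{n-1}\,dr\right]d\omega_n =$$

$$= -\int_{\omega_n}\left\{\int_0^{\infty} u(x,r,\omega_n)\,\frac{\partial}{\partial r}\left[\int_r^{\infty} p(x,\rho,\omega_n)\,\rho^{n-1}\,d\rho\right]dr\right\}d\omega_n =$$

$$= -\int_{\omega_n}\left[u(x,r,\omega_n)\int_r^{\infty} p(x,\rho,\omega_n)\,\rho^{n-1}\,d\rho\right]_{r=0}^{r=\infty}d\omega_n +$$

$$+ \int_{\omega_n}\left\{\int_0^{\infty}\frac{\partial u(x,r,\omega_n)}{\partial r}\left[\int_r^{\infty}p(x,\rho,\omega_n)\,\rho^{n-1}\,d\rho\right]dr\right\}d\omega_n =$$

$$= u(x) + \int_{\omega_n}\left\{\int_0^{\infty}\frac{1}{r^{n-1}}\,\frac{\partial u(x,r,\omega_n)}{\partial r}\left(\int_r^{\infty}p(x,\rho,\omega_n)\,\rho^{n-1}\,d\rho\right)r^{n-1}\,dr\right\}d\omega_n ,$$

and finally

$$\int_D u(y)\,p(y)\,dy = u(x) - \int_D \frac{1}{r^{n-1}}\,\frac{\partial u}{\partial r}\,B(x,y)\,dy, \tag{180}$$

where

$$B(x,y) = -\int_r^{\infty} p(x,\rho,\omega_n)\,\rho^{n-1}\,d\rho, \tag{181}$$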
T 


and the point y corresponds to the spherical coordinates $(r, \omega_n)$ with the origin
at x. The function B(x, y) is bounded, if x and y ∈ D, and is continuous for
x ≠ y. If y → x along a straight ray, B(x, y) has the limit

$$-\int_0^{\infty} p(x,\rho,\omega_n)\,\rho^{n-1}\,d\rho,$$

depending on the angular coordinates of the ray. It follows at once from the
definition of B(x, y) that, if x belongs to the sphere K, B(x, y) = 0 for y belonging
to D and lying outside K, whilst if x is outside K, B(x, y) = 0 for y belonging
to D and lying outside the domain formed by the sphere K and the part of the
cone with vertex x tangential to the sphere, which lies between x and the
sphere. On using the formula





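$$\frac{\partial u(y)}{\partial r} = \sum_{i=1}^{n}\frac{\partial u(y)}{\partial y_i}\,\frac{\partial y_i}{\partial r} = \sum_{i=1}^{n}\frac{\partial u(y)}{\partial y_i}\,\frac{y_i-x_i}{r},$$

we obtain by (180):

$$u(x) = U + \sum_{i=1}^{n}\int_D \frac{\partial u(y)}{\partial y_i}\,\frac{B_i(x,y)}{|x-y|^{n-1}}\,dy, \tag{182}$$

where

$$U = \int_D u(y)\,p(y)\,dy; \qquad B_i(x,y) = B(x,y)\,\frac{y_i-x_i}{|y-x|}. \tag{183}$$

The kernels $B_i(x, y)$ obviously have the same properties as B(x, y): they
are bounded for x and y ∈ D, are continuous for x ≠ y, and vanish in the above-
mentioned part of D.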
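
As a numerical sanity check of (182), the following minimal sketch specialises it to one dimension (n = 1). This specialisation is an assumption made purely for illustration: K becomes the interval (−R, R), the angular variable reduces to the two directions ±1, and the factor $|x-y|^{n-1}$ equals 1. The bump follows (179); the test function u, the interval D = (−2, 2) and all Python names (`bump`, `midpoint`, `P`, `B1`, `representation`) are hypothetical choices, not taken from the text.

```python
import math

R = 1.0  # radius of K; in one dimension K is the interval (-R, R)

def bump(y):
    # Unnormalised form of (179): exp(-R^2 / (R^2 - y^2)) inside K, zero outside.
    if abs(y) >= R:
        return 0.0
    return math.exp(-R * R / (R * R - y * y))

def midpoint(f, a, b, n):
    # Composite midpoint rule on [a, b] with n cells.
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

C = 1.0 / midpoint(bump, -R, R, 2000)  # normalisation: integral of p over K is 1

def p(y):
    return C * bump(y)

def P(y):
    # Integral of p over (-infinity, y).
    if y <= -R:
        return 0.0
    if y >= R:
        return 1.0
    return midpoint(p, -R, y, 200)

def B1(x, y):
    # One-dimensional counterpart of B_i in (183) (our own specialisation):
    # B = -(integral of p along the ray from y outwards), times the sign of y - x.
    return P(y) if y < x else P(y) - 1.0

def u(y):
    return math.sin(y) + y * y

def du(y):
    return math.cos(y) + 2.0 * y

def representation(x, a=-2.0, b=2.0, n=1000):
    # Right-hand side of (182) for n = 1: U + integral over D of u'(y) B1(x, y) dy.
    # The integral is split at y = x, where B1 jumps.
    U = midpoint(lambda y: u(y) * p(y), -R, R, 2000)
    left = midpoint(lambda y: du(y) * B1(x, y), a, x, n)
    right = midpoint(lambda y: du(y) * B1(x, y), x, b, n)
    return U + left + right

for x in (0.3, 1.5):
    print(x, u(x), representation(x))
```

The two sample points lie inside and outside K respectively, so both descriptions of the support of B(x, y) given above are exercised; the two printed values should agree up to quadrature error.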

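Now suppose that u(y) is continuous and has continuous derivatives up
to order l in D, and let us deduce an expression for u(x) in terms of its lth
order derivatives. Using (182), we can write

$$\frac{\partial u(y)}{\partial y_i} = U_i + \sum_{j=1}^{n}\int_D \frac{\partial^2 u(z)}{\partial z_i\,\partial z_j}\,\frac{B_j(y,z)}{|y-z|^{n-1}}\,dz, \tag{184}$$

where

$$U_i = \int_D \frac{\partial u(y)}{\partial y_i}\,p(y)\,dy. \tag{185}$$

On substituting (184) in (182) and using Theorem 3 of [115], we obtain

$$u(x) = U + \sum_{i=1}^{n} U_i\,b_i(x) + \sum_{i,j=1}^{n}\int_D \frac{\partial^2 u(z)}{\partial z_i\,\partial z_j}\,\frac{B_{ij}(x,z)}{|x-z|^{n-2}}\,dz, \tag{186}$$

where

$$b_i(x) = \int_D \frac{B_i(x,y)}{|x-y|^{n-1}}\,dy; \qquad \frac{B_{ij}(x,z)}{|x-z|^{n-2}} = \int_D \frac{B_i(x,y)\,B_j(y,z)}{|x-y|^{n-1}\,|y-z|^{n-1}}\,dy, \tag{187}$$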


the kernels $B_{ij}(x, z)$ have the same properties as B(x, y), and, by Theorem 1
of [115] (with f(y) = 1), the $b_i(x)$ are continuous in D. The expression for
u(x) in terms of the lth order derivatives is similarly obtained:

$$u(x) = U + \sum_{1\le k\le l-1}\ \sum_{i_1,\dots,i_k=1}^{n} U_{i_1,\dots,i_k}\,b_{i_1,\dots,i_k}(x) + \sum_{i_1,\dots,i_l=1}^{n}\int_D \frac{\partial^l u(y)}{\partial y_{i_1}\cdots\partial y_{i_l}}\,\frac{B_{i_1,\dots,i_l}(x,y)}{|x-y|^{n-l}}\,dy, \tag{188}$$

where $b_{i_1,\dots,i_k}(x)$ are continuous in D and the kernels $B_{i_1,\dots,i_l}(x, y)$ have
the same properties as B(x, y),





$$U_{i_1,\dots,i_k} = \int_D \frac{\partial^k u(y)}{\partial y_{i_1}\cdots\partial y_{i_k}}\,p(y)\,dy, \tag{189}$$

and the summation over $i_1, \dots, i_l$ is a summation over all the types of derivatives
of order l. Notice that, when n < l, the kernels in (188) are bounded [115].

It should also be noticed that the $U_{i_1,\dots,i_k}$ are linear functionals in $L_p(D)$.
This can be seen by writing them in the form

$$U_{i_1,\dots,i_k} = (-1)^k \int_D u(y)\,\frac{\partial^k p(y)}{\partial y_{i_1}\cdots\partial y_{i_k}}\,dy. \tag{190}$$


We now turn to a consideration of space $W_p^{(l)}(D)$. We recall that the norm
in it is given by


a 
lakaot f, a Arena (191) 


(one of the equivalent norms [112]). Let us show that integral form (188)
holds for any function $u(x) \in W_p^{(l)}(D)$. Let $u_m(x)$ be a sequence of functions of
$C^{(l)}(D)$, convergent to u(x) in $W_p^{(l)}(D)$ [111]. We write down (188) for the
$u_m(x)$ and assign form (190) to the $U_{i_1,\dots,i_k}$, then pass to the limit as m → ∞.

By Theorems 1 and 2 of [115], the integral operators with kernels
$B_{i_1,\dots,i_l}(x, y)\,|x - y|^{-(n-l)}$ are continuous in $L_p(D)$, so that we are able to pass to the limit
under the integral sign. Therefore, integral form (188) is valid for any functions
of $W_p^{(l)}(D)$. We now show that functions of $W_p^{(l)}(D)$ have all possible generalized
derivatives of any order k < l from $L_p(D)$. On applying (182) to the derivative
of $u_m(x)$ of order (l − 1), we obtain





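$$\frac{\partial^{l-1}u_m(x)}{\partial x_{i_1}\cdots\partial x_{i_{l-1}}} = U^{(m)}_{i_1,\dots,i_{l-1}} + \sum_{k=1}^{n}\int_D \frac{\partial^l u_m(y)}{\partial y_{i_1}\cdots\partial y_{i_{l-1}}\,\partial y_k}\,\frac{B_k(x,y)}{|x-y|^{n-1}}\,dy, \tag{192}$$

where

$$U^{(m)}_{i_1,\dots,i_{l-1}} = \int_D \frac{\partial^{l-1}u_m(y)}{\partial y_{i_1}\cdots\partial y_{i_{l-1}}}\,p(y)\,dy = (-1)^{l-1}\int_D u_m(y)\,\frac{\partial^{l-1}p(y)}{\partial y_{i_1}\cdots\partial y_{i_{l-1}}}\,dy.$$

By Theorem 1 or 2 of [115], depending on whether n − 1 < n/p′ or n − 1 ≥ n/p′,
we can say that the integral operator on the right-hand side of (192)
is continuous as an operator from $L_p(D)$ into C(D) or $L_q(D)$, i.e. always into
$L_p(D)$. By hypothesis, the $u_m(x)$ are convergent in the norm of $W_p^{(l)}(D)$ to u(x);
hence it follows that, as m → ∞, the right-hand side of (192) has a limit in
$L_p(D)$, so that u(x) has generalized derivatives of order (l − 1) of $L_p(D)$
in D, for which (192) holds with $u_m(x)$ replaced by u(x) and $U^{(m)}_{i_1,\dots,i_{l-1}}$ by

$$U_{i_1,\dots,i_{l-1}} = (-1)^{l-1}\int_D u(y)\,\frac{\partial^{l-1}p(y)}{\partial y_{i_1}\cdots\partial y_{i_{l-1}}}\,dy.$$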
D 

On using the integral forms for the derivatives of any order k < l in terms
of the derivatives of order l, similar to (188) and (192), we can establish in the
same way the existence of all possible generalized derivatives of any lower
orders from $L_p(D)$. These integral forms can now be automatically extended
to the whole of $W_p^{(l)}(D)$. Notice that the corresponding integral operators have
kernels with polarity of order n − (l − k). The boundedness of these integral
operators in $L_p(D)$ leads directly to the inequality











aku (x) | [ | ou (x) 
area eel Aa C _ u(x) 
| Ops...) OTi <C {|| Ait i 5 l | Oay,.. - Oxi, | | (193) 
Ly(D) weu OB Ly D 


(k = 1, 2,.., L — 1). 
whence (152) follows. 

We can now turn form (189) back to (190). We have thus shown that, for 
domains star-shaped with respect to a sphere, spaces WẸ) and wp consist of the 
same set of functions, and by (193), the norms in these spaces are equivalent.

The integral form (188) of functions of $W_p^{(l)}(D)$ was obtained in a somewhat
different form by Sobolev (see S. L. Sobolev, Some Applications Of Functional
Analysis To Mathematical Physics (Nekotorye primeneniya funktsional’nogo
analiza v matematicheskoi fizike), LGU, 1950).

We shall next prove the general embedding theorems stated in [114] for 
domains star-shaped with respect to a sphere, and then show how the theorems 
may be extended to a wider class of domains. Our assertion regarding the equi- 
valence of W{)(D) and WOND) will thereby also be extended to a wider class 
of domains. 


117. Embedding theorems. Let us return to the integral form (188). The
terms outside the integral on the right-hand side of (188) are completely continuous
operators from $W_p^{(l)}(D)$ into C(D). For, any function $u(x) \in W_p^{(l)}(D)$ is
mapped by such an operator into the same function $b_{i_1,\dots,i_k}(x)$, continuous
in D, multiplied by the number $U_{i_1,\dots,i_k}$, representing a functional continuous
in $W_p^{(l)}(D)$. If the set of functions u(x) is bounded in $W_p^{(l)}(D)$, we can extract
from the numbers $U_{i_1,\dots,i_k}$ a convergent sequence. The corresponding sequence
of functions $U_{i_1,\dots,i_k}\,b_{i_1,\dots,i_k}(x)$ is uniformly convergent in D.
Theorems 1 or 2 of [115] are applicable to the integral terms of (188). We thus
obtain at once the following two embedding theorems.


+ In future, we need not differentiate between WEND) and WẸXD). 


THEOREM 1. If pl > n, every function $u(x) \in W_p^{(l)}(D)$ is continuous in
D ($W_p^{(l)}(D) \subset C(D)$) and the embedding operator is bounded:

$$\max_{x\in D}|u(x)| = \|u\|_{C(D)} \le K\,\|u\|_{W_p^{(l)}(D)} \qquad (K > 0 \text{ is constant})$$

and completely continuous, i.e. it maps every set of functions, bounded in
$W_p^{(l)}(D)$, into a compact set in C(D).

THEOREM 2. Let $pl \le n$ and $s > n - pl$. Then functions u(x) of $W_p^{(l)}(D)$
belong to $L_q(D_s)$ on any plane s-dimensional section $D_s$, for any

$$q < q^* = \frac{ps}{n-pl}. \tag{194}$$

The embedding operator from $W_p^{(l)}(D)$ into $L_q(D_s)$ is bounded and completely
continuous. As an element of $L_q(D_s)$, u(x) is continuous in the metric of $L_q(D_s)$
with respect to a parallel displacement of the section $D_s$, if the latter is permissible.

Note 1. Functions u(x) of $W_p^{(l)}(D)$ are defined apart from equivalent
functions, and the statement of the theorem regarding the behaviour of u(x)
on sections refers to a certain choice of the class of equivalents (cf. [113]). Notice
that, as follows from the above, (188) defines precisely this type of function.

2. If we replace $L_q(D_s)$ in Theorem 2 by $L_{q^*}(D_s)$, it can be shown that the
theorem remains valid up to and including the word “bounded”.

3. If we take an s-manifold $\Gamma_s$ in D (it may lie on the boundary of D), which
may be mapped (at any rate piece by piece) into a plane with the aid of
an l-times continuously differentiable and uniquely reversible change of variables
$y_i = y_i(x_1, x_2, \dots, x_n)$ (i = 1, ..., n), Theorem 2 remains valid when
$D_s$ is replaced by $\Gamma_s$. It is required here that the change of variables be defined
in some n-dimensional neighbourhood of $\Gamma_s$.
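
The case distinction in Theorems 1 and 2 is simple arithmetic on n, p, l and s, and can be sketched as follows. This is only an illustrative helper: the function name `embedding` and its return labels are hypothetical, and treating the borderline case pl = n as "every finite q" is our reading of (194), where the denominator n − pl vanishes.

```python
from fractions import Fraction

def embedding(n, p, l, s=None):
    """Classify the embedding of W_p^(l)(D), D in R^n, on an s-dimensional section.

    Returns a (label, q_star) pair; q_star is None unless (194) gives a finite bound.
    Arguments are assumed to be integers or Fractions.
    """
    if s is None:
        s = n  # the section is the whole domain
    if p * l > n:
        # Theorem 1: continuous functions, embedding into C(D).
        return ("continuous", None)
    if s <= n - p * l:
        # The hypothesis s > n - pl of Theorem 2 fails.
        return ("not covered", None)
    if p * l == n:
        # Borderline case (assumed reading): q* is infinite, any finite q is allowed.
        return ("L_q for every finite q", None)
    # Theorem 2: q < q* = ps / (n - pl), formula (194).
    return ("L_q for q < q*", Fraction(p * s, n - p * l))

print(embedding(3, 2, 2))       # p*l = 4 > n = 3
print(embedding(4, 2, 1))       # whole domain, q* = 8/2
print(embedding(3, 2, 1, s=2))  # two-dimensional sections, q* = 4/1
print(embedding(3, 2, 1, s=1))  # s = n - pl, outside Theorem 2
```

Exact rational arithmetic is used so that the limiting exponent is not blurred by rounding.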

We have already remarked that the generalized derivatives of order m < l
of functions $u(x) \in W_p^{(l)}(D)$ are expressed by formulae analogous to (188), in
terms of the derivatives of order l with the aid of integral operators with
polarity of the order n − (l − m). On applying the theorems of [115], the following
theorems are obtained as above:

THEOREM 3. If pl > n and $0 \le m < l - n/p$, the generalized derivatives of
order m of functions $u(x) \in W_p^{(l)}(D)$ are continuous in D, and the embedding
operator from $W_p^{(l)}(D)$ into $C^{(m)}(D)$ is bounded and completely continuous.

THEOREM 4. If $m \ge l - n/p$ and $s > n - (l - m)p$, the generalized derivatives
of order m of functions $u(x) \in W_p^{(l)}(D)$ belong to $L_q(D_s)$, on s-dimensional
plane sections $D_s$ of the domain D for any

$$q < q^* = \frac{ps}{n-(l-m)p}, \tag{195}$$

where

$$\text{(a)}\quad \|D^m u\|_{L_q(D_s)} \le K\,\|u\|_{W_p^{(l)}(D)}; \tag{196}$$

(b) for any set bounded in $W_p^{(l)}(D)$ of functions u(x), the set $D^m u(x)$ is compact
in $L_q(D_s)$;

(c) the functions $D^m u(x)$ in the metric of $L_q(D_s)$ are continuous with respect
to a permissible parallel displacement of $D_s$.




The same remarks can be made on Theorem 4 as on Theorem 2. 

We now turn to the proof of Theorem 3 of [114] regarding equivalent norms
in $W_p^{(l)}(D)$. Let us recall the proposition.

Let the linear functionals $l_k(u)$ (k = 1, 2, ..., N), bounded in $W_p^{(l)}(D)$, be such
that they do not vanish simultaneously on a non-identically-zero polynomial of
degree not higher than l − 1. Then the norm in $W_p^{(l)}(D)$, defined by

$$\|u\| = \left[\int_D \sum_{i_1,\dots,i_l}\left|\frac{\partial^l u}{\partial x_{i_1}\cdots\partial x_{i_l}}\right|^p dx + \sum_{k=1}^{N}|l_k(u)|^p\right]^{1/p}, \tag{197}$$

is equivalent to norm (191).

PROOF. Obviously, norm (197) has an upper bound expressible in terms of
norm (191), since the functionals $l_k(u)$ (k = 1, 2, ..., N) are bounded in the
latter norm.

We must prove the reverse inequality. For the moment, let us simply write
$\|u\|$ for norm (197) and $\|u\|_{W_p^{(l)}(D)}$ for norm (191), equivalent to norm (145).

We have to show that

$$\|u\|_{W_p^{(l)}(D)} \le A\,\|u\| \tag{198}$$

for all functions of $W_p^{(l)}(D)$.

We assume the opposite, i.e. that there exists an infinite sequence of positive
numbers $A_m$ (m = 1, 2, ...) and elements $u_m(x)$ of $W_p^{(l)}(D)$, such that
$A_m \to +\infty$ as m → ∞, and

$$\|u_m\|_{W_p^{(l)}(D)} > A_m\,\|u_m\|. \tag{199}$$

By introducing constant factors into the $u_m(x)$, we can assume that

$$\|u_m\|_{W_p^{(l)}(D)} = 1. \tag{200}$$

It follows from (199) and (200) that $\|u_m\| \to 0$ as m → ∞, so that all the
generalized derivatives of order l of the $u_m(x)$ are convergent to zero in $L_p(D)$.
By (200) and Theorem 2, the sequence $u_m(x)$ is compact in $L_p(D)$. We extract
from it a convergent subsequence, which we again denote by $u_m(x)$, and let
$u_m(x) \to u_0(x)$ in $L_p(D)$ as m → ∞. It now follows from Theorem 2 [109] that
all the generalized derivatives of order l of $u_0(x)$ exist and are zero. Notice also
that the $u_m(x)$ are convergent to $u_0(x)$ in the sense of norm (191). We now show
that $u_0(x) \equiv 0$. We take a strictly interior subdomain D′ of D. Given sufficiently
small h, the derivatives of the mean functions $u_{0h}(x)$ coincide in D′ [109]
with the mean functions of the derivatives $D^l u_0(x)$. It follows at once from this
that all the derivatives of order l of the mean functions $u_{0h}(x)$ are zero in D′,
and hence $u_{0h}(x)$ are polynomials in $x_1, \dots, x_n$ of degree not higher than l − 1.
Since the set of such polynomials forms a subspace in $L_p(D')$, and $u_{0h}(x) \to u_0(x)$
in $L_p(D')$, we see that $u_0(x)$ is a polynomial of degree not higher than l − 1
in any interior subdomain, i.e. everywhere in D. We now observe that, since
the functionals $l_k(u)$ are continuous in $W_p^{(l)}(D)$, $l_k(u_m) \to l_k(u_0)$ as m → ∞
(k = 1, 2, ..., N). At the same time, it follows from (199) and (200) that
$l_k(u_m) \to 0$ as m → ∞ (k = 1, 2, ..., N). We thus find that $l_k(u_0) = 0$
(k = 1, 2, ..., N), and $u_0(x) \equiv 0$ by hypothesis of the theorem. But this last
contradicts (200) and the convergence of $u_m(x)$ to $u_0(x)$ in $W_p^{(l)}(D)$. The theorem
is proved. Some simple examples will now be given of the use of the above
theorems.

(1) Let $u(x) \in W_2^{(2)}(D)$, i.e. u(x), together with its generalized derivatives
up to and including order 2, is square summable over D. Given $n \le 3$, it follows
from Theorem 1 that u(x) is continuous in D. This assertion may be false
with $n \ge 4$.

(2) Norm (162) in $W_p^{(1)}(D)$, that we obtained with the aid of Theorem 3 in
[114], leads with p = 2 to the familiar Poincaré inequality:

$$\int_D u^2(y)\,dy \le B\left[\int_D \sum_{i=1}^{n}\left(\frac{\partial u}{\partial y_i}\right)^2 dy + \left(\int_D u(y)\,dy\right)^2\right]. \tag{201}$$

(3) Similarly, norm (164) gives with p = 2 and q = 2:

$$\int_D u^2(y)\,dy \le C\left[\int_D \sum_{i=1}^{n}\left(\frac{\partial u}{\partial y_i}\right)^2 dy + \int_S u^2(y)\,dS\right]. \tag{202}$$

Here S is a sufficiently smooth (n − 1)-dimensional manifold in D. In particular,
if the boundary of D is piecewise smooth, it can be taken as the manifold
S in (202). In this latter case (202) is the familiar Friedrichs inequality.
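
A quick numerical illustration of the Poincaré inequality (201) can be given in one dimension. The sketch below assumes D = (0, 1) and takes B = 1, which is an admissible constant for this particular interval (a choice made for the example, not a value stated in the text); the sample functions and the names `midpoint` and `poincare_terms` are likewise hypothetical.

```python
import math

def midpoint(f, a, b, n=4000):
    # Composite midpoint rule on [a, b] with n cells.
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def poincare_terms(u, du, a=0.0, b=1.0):
    # Returns the three integrals entering (201) for n = 1:
    # integral of u^2, integral of (u')^2, and (integral of u)^2.
    lhs = midpoint(lambda y: u(y) ** 2, a, b)
    grad = midpoint(lambda y: du(y) ** 2, a, b)
    mean_sq = midpoint(u, a, b) ** 2
    return lhs, grad, mean_sq

samples = [
    (lambda y: math.cos(math.pi * y) + 0.5,
     lambda y: -math.pi * math.sin(math.pi * y)),
    (lambda y: y ** 3,
     lambda y: 3.0 * y ** 2),
]

for u, du in samples:
    lhs, grad, mean_sq = poincare_terms(u, du)
    # With B = 1 the left-hand side should not exceed grad + mean_sq.
    print(lhs, grad + mean_sq)
```

The second term on the right is essential: for a non-zero constant function the gradient term vanishes and only the squared mean controls the left-hand side.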


118. Domains of a more general type. We must now consider how to carry
over the embedding theorems to a wider class of domains. Let the bounded
domain D be divisible by piecewise smooth (n − 1)-manifolds into a finite number
of domains, in each of which everything proved in [117] is valid. Then it is also
valid in D. Obviously it is sufficient to consider the case when D is divided
by a surface $S_0$ into two non-intersecting parts $D_1$ and $D_2$. Let $u(x) \in W_p^{(l)}(D)$.
We must show first of all that u(x) has in D all generalized derivatives of lower
orders of $L_p(D)$. Obviously, $u(x) \in W_p^{(l)}(K)$, where K is any sphere lying in
D. On applying what was said in [116] to this sphere, it will be seen that u(x)
has in K all possible generalized derivatives of the form $\chi(x) = D^k u(x)$
(0 < k < l) belonging to $L_p(K)$. The function $\chi(x)$ is defined everywhere in D
and belongs to $L_p(D)$. For, u(x) has in $D_1$ and $D_2$ a derivative $D^k u(x)$ from
$L_p(D_1)$ and $L_p(D_2)$ respectively. Since the generalized derivative is unique, we
can say that $\chi(x)$ coincides with $D^k u(x)$ in $D_1$ and $D_2$, so that $\chi(x) \in L_p(D)$.
We now show that $\chi(x)$ is the generalized derivative $D^k u(x)$ in D. Let D′ be
an arbitrary strictly interior subdomain of D and δ > 0 the distance of D′
from the boundary of D. Let x be an arbitrary point of D′ and 0 < h < δ.
The function u(y) has the generalized derivative $D^k u(y) = \chi(y)$ in the sphere
of radius δ with centre at the point x. On forming the means with averaging
radius h, we can say [109] that $D^k u_h(x) = \chi_h(x)$ at the centre of the sphere.
Since $u_h(x) \to u(x)$ as h → 0 and $D^k u_h(x) \to \chi(x)$ in $L_p(D')$, by the second
definition of generalized derivatives [109], $\chi(x)$ is the generalized derivative
$D^k u(x)$ of u(x) in D. It remains to recall that $\chi(x) \in L_p(D)$.

Note. Our discussion shows that functions of $W_p^{(l)}(D)$ have in any domain
D generalized derivatives of all lower orders belonging to $L_p(D')$ for every
strictly interior subdomain D′.


We now consider the question of carrying over Theorem 1 of [117] to our
domain D. Suppose that n < pl, so that u(x) is continuous in $D_1$ and $D_2$. We
must show that u(x) is continuous in D; for this, it is sufficient to show that
u(x) is continuous at points of the surface $S_0$. The continuity of u(x) at any
interior point of D follows from the theorem for embedding $W_p^{(l)}$ into C, applied
to a sufficiently small sphere. As regards points of $S_0$ lying on the boundary
of D, since u(x) is continuous in $D_1$, the boundary values of u(x) taken over
any path lying in $D_1$ are the same as the boundary values obtained along a
path in $S_0$. The same can be said of the boundary values on approaching from
$D_2$. Hence it follows that the boundary values of u(x) are the same over any
path, and u(x) is continuous in D. The complete continuity (and hence the
boundedness) of the embedding operator follows from the fact that, given a set
bounded in $W_p^{(l)}(D)$, we can first extract a sequence convergent in $C(D_1)$, then
extract from this a subsequence convergent in $C(D_2)$. This subsequence is obviously
uniformly convergent in D.

It can similarly be shown that Theorem 3 of [117] can be carried over to the
case in question. No difficulty arises in the extension of Theorem 2 [117].
We only have to observe that an s-dimensional section $D_s$ in general also splits
into two parts: $D_s = D_s' + D_s''$, where $D_s' = D_s \cap D_1$, $D_s'' = D_s \cap D_2$. On applying
Theorem 2 of [117] to $D_1$ and $D_2$, we obtain

$$\|u\|_{L_q(D_s)} \le \|u\|_{L_q(D_s')} + \|u\|_{L_q(D_s'')} \le C\left[\|u\|_{W_p^{(l)}(D_1)} + \|u\|_{W_p^{(l)}(D_2)}\right] \le 2C\,\|u\|_{W_p^{(l)}(D)}. \tag{203}$$

This establishes that the embedding operator from $W_p^{(l)}(D)$ into $L_q(D_s)$ is
bounded. We can similarly extend our statement about the strong continuity
in $L_q(D_s)$ of functions of $W_p^{(l)}(D)$ as regards a parallel displacement of the section
$D_s$. The complete continuity of the operator of embedding from $W_p^{(l)}(D)$
into $L_q(D_s)$ follows at once from (203) and the strong continuity with respect
to a displacement.

No new arguments are required for the extension of Theorem 4 of [117].

Theorem 3 of [114], regarding equivalent norms in $W_p^{(l)}(D)$, also remains
valid, since the proof given in [117] was based solely on the embedding theorems.

We can now say that, if each of the partial domains composing D is star-shaped
with respect to a sphere, all the embedding theorems discussed in
[114] and [117] are valid for D.


119. Space 0D. Let D be a finite or infinite domain of space 
Rna, and OD) the set of all finite functions g(x) continuous in D 
and having continuous derivatives up to order J in D. Obviously, 
C(D) [114] is a linear space. As in the case of CD) [114], we 
introduce the following norm into it: 

$$\|\varphi\|_{C^{(l)}(D)} = \max_{x\in D,\ 0\le k\le l}|D^k\varphi(x)|. \tag{204}$$


The closure of O(D) with respect to this norm leads to a B space 
which we shall write as CD). The elements of this space are bounded 
functions, continuously differentiable in D up to order J, where the 
functions themselves and their derivatives vanish on the boundary 
of D. 

Spaces $\mathring{C}^{(l)}(D)$ with different l = 0, 1, ... are naturally embedded
in each other: $\mathring{C}^{(l)}(D) \subset \mathring{C}^{(h)}(D)$ if l > h, the set of elements of the
former space being dense in the latter. This is easily proved by employing
an averaging process.

Let us consider the space conjugate to $\mathring{C}^{(l)}(D)$, which we denote
by $U^{(l)}(D)$ (the space of linear functionals for $\mathring{C}^{(l)}(D)$). It is easily
seen that $U^{(l)}(D) \subset U^{(h)}(D)$ for h > l.

We can take as examples of elements of $U^{(l)}(D)$ the functionals
defined with the aid of a kernel $\psi(x)$ summable over D by the equation

$$(m, \varphi) = \int_D \psi(x)\,\varphi(x)\,dx. \tag{205}$$

Their norm satisfies the inequality

$$\|m\| \le \int_D |\psi(x)|\,dx.$$

Such functionals are often described as being of the function type,
and are identified with the kernels defining them. The remaining
functionals are described as generalized functions. Functionals of the
function type do not exhaust the whole of $U^{(l)}(D)$. For instance, the
functional $\delta(x - x_0)$, defined by

$$(\delta(x - x_0), \varphi) = \varphi(x_0), \tag{206}$$

where $x_0$ is a fixed point of D, cannot be written in form (205) with
a kernel summable over D. The kernel corresponding to this functional
is said to be a delta function $\delta(x - x_0)$, concentrated at the point $x_0$,
and we write

$$\int_D \delta(x - x_0)\,\varphi(x)\,dx = \varphi(x_0). \tag{207}$$

But $\delta(x - x_0)$ is not a function in the ordinary sense. As we shall
show, however, the elements of $U^{(l)}(D)$, expressible in form (205)
with a piecewise continuous kernel, are dense in $U^{(l)}(D)$. To be more
precise:

THEOREM. The functionals of the function type with a piecewise
continuous kernel are dense in $U^{(l)}(D)$ in the sense of weak convergence
of the functionals. Let us prove a preliminary lemma:


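LEMMA. Given ε > 0, there exists for any fixed element $\varphi(x) \in \mathring{C}^{(l)}(D)$
a domain $D_\varepsilon$ lying strictly inside D such that

$$\max_{x\in D-D_\varepsilon,\ 0\le k\le l}|D^k\varphi| < \varepsilon. \tag{208}$$

By the definition of $\mathring{C}^{(l)}(D)$, there exists a sequence $\varphi_m(x)$ of
finite functions of the set introduced at the beginning of this section which
tends to $\varphi(x)$ in the norm (204), i.e. there
exists a subscript $m_\varepsilon$ such that, given any p > 0,

$$\|\varphi_{m_\varepsilon} - \varphi_{m_\varepsilon+p}\|_{C^{(l)}(D)} < \varepsilon.$$

The function $\varphi_{m_\varepsilon}(x)$ differs from zero only in a domain $D_\varepsilon$ of the
type considered above, so that it follows from the last inequality
that, for $x \in D - D_\varepsilon$:

$$|D^k\varphi_{m_\varepsilon+p}(x)| < \varepsilon \qquad (0 \le k \le l),$$

and we get (208) in the limit as p → ∞.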

We turn to the proof of our theorem. Let $m \in U^{(l)}(D)$. We take
the averaging kernel $\omega_\rho(|x - y|)$. Obviously, given an x belonging
to a bounded domain $D_\rho$, lying strictly inside D and at a distance of
not less than 2ρ from the boundary of D, $\omega_\rho(|x - y|)$, regarded as
a function of y, is a finite function of the set introduced at the beginning
of this section, so that the function

$$m_\rho(x) = (m, \omega_\rho(|x - y|)) \tag{209}$$

is defined for $x \in D_\rho$. In view of the infinite differentiability of
$\omega_\rho(|x - y|)$ and the continuity of the functional m, it will also have
continuous derivatives of all orders with respect to x. We now define
for all x ∈ D the piecewise continuous function

$$\tilde{m}_\rho(x) = \begin{cases} m_\rho(x) & \text{for } x \in D_\rho, \\ 0 & \text{for } x \in D - D_\rho, \end{cases}$$

and take the sequence $\rho = \rho_1, \rho_2, \dots$, tending to zero, the $D_{\rho_k}$ being
assumed chosen in such a way that $D_{\rho_k} \subset D_{\rho_{k+1}}$, and the $D_{\rho_k}$ tend in
the limit to D. We shall show that the functionals $m_{\rho_k}$ defined by

$$(m_{\rho_k}, \varphi) = \int_D \tilde{m}_{\rho_k}(x)\,\varphi(x)\,dx = \int_{D_{\rho_k}} m_{\rho_k}(x)\,\varphi(x)\,dx,$$

tend to our functional m. For this, we take an arbitrary function
$\varphi(x) \in \mathring{C}^{(l)}(D)$ and show that the difference

$$r_k = (m, \varphi) - (m_{\rho_k}, \varphi) = (m, \varphi) - \int_{D_{\rho_k}} (m, \omega_{\rho_k}(|x - y|))\,\varphi(x)\,dx \tag{210}$$

tends to zero as k → ∞, whence the theorem will follow. By our
lemma, given ε > 0, we can find a bounded domain $D_\varepsilon$, strictly
inside D, such that (208) is satisfied for $\varphi(x)$. We shall assume k
so large that $D_\varepsilon \subset D_{\rho_k}$. We write


$$\chi_k(y) = \int_{D_{\rho_k}} \varphi(x)\,\omega_{\rho_k}(|x - y|)\,dx.$$

The Riemann sums for this integral:

$$T(y) = \sum_s \varphi(\xi_s)\,\omega_{\rho_k}(|\xi_s - y|)\,\Delta_s x,$$

where $\Delta_s x$ is the measure of a subdomain and $\xi_s$ belongs to the subdomain,
are uniformly convergent in y, along with their derivatives
with respect to y up to order l, to the function $\chi_k(y)$ and its corresponding
derivatives on indefinite subdivision by the $\Delta_s x$, due to the
continuity of $\varphi(x)$ and the infinite differentiability of $\omega_{\rho_k}(|\xi - y|)$
with respect to y. But, since the functional is distributive and continuous
in norm (204), we have

$$\sum_s (m, \varphi(\xi_s)\,\omega_{\rho_k}(|\xi_s - y|)\,\Delta_s x) = \Bigl(m, \sum_s \varphi(\xi_s)\,\omega_{\rho_k}(|\xi_s - y|)\,\Delta_s x\Bigr),$$

and in the limit

$$\int_{D_{\rho_k}} (m, \varphi(x)\,\omega_{\rho_k}(|x - y|))\,dx = \Bigl(m, \int_{D_{\rho_k}} \varphi(x)\,\omega_{\rho_k}(|x - y|)\,dx\Bigr),$$

so that expression (210) for $r_k$ can be written as

$$r_k = \Bigl(m, \varphi(y) - \int_{D_{\rho_k}} \varphi(x)\,\omega_{\rho_k}(|x - y|)\,dx\Bigr). \tag{211}$$


The function $\chi_k(y)$ is not the usual average of $\varphi(x)$ with kernel
$\omega_{\rho_k}(|x - y|)$, since the domain of integration $D_{\rho_k}$ cannot contain
the whole of the sphere $|x - y| < \rho_k$ if y ∈ D. But if y belongs to
some strictly interior bounded subdomain of D, the sphere $|x - y| < \rho_k$
will belong to $D_{\rho_k}$ for all sufficiently large k, and for such y,
$\chi_k(y)$ is the average of $\varphi(x)$ with kernel $\omega_{\rho_k}(|x - y|)$. It is clear that
$\chi_k(y)$ is again a finite function of the set introduced at the beginning of
this section. Let us show that $\chi_k(y)$ are convergent to $\varphi(y)$ in
norm (204). For this, we take δ > 0 less than the distance from $D_\varepsilon$
to the boundary of D, and write $D_\delta$ for the domain obtained by
associating with $D_\varepsilon$ all the spheres of radius δ with centres in $D_\varepsilon$.
For $y \in D_\delta$ and all sufficiently large k, $D_\delta \subset D_{\rho_k}$, and

$$\chi_k(y) = \int_{|x-y|\le\rho_k} \varphi(x)\,\omega_{\rho_k}(|x - y|)\,dx.$$

The $\chi_k(y)$ are therefore convergent, uniformly with respect to
$y \in D_\delta$, to $\varphi(y)$ as k → ∞, and the same for their derivatives up to
order l with respect to y [71]. If $y \in D - D_\delta$, both $\varphi(y)$ and $\chi_k(y)$
are uniformly small together with their derivatives up to order l,
since (208) holds for $\varphi(y)$ when $y \in D - D_\varepsilon$. Thus $\chi_k(y)$ are convergent
to $\varphi(y)$ in norm (204), and it follows from (211) that $r_k \to 0$ as k → ∞.
The theorem is proved.
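
For the delta functional (206) the construction (209) gives simply $m_\rho(x) = \omega_\rho(|x - x_0|)$, so the theorem can be watched at work numerically. The sketch below is a one-dimensional illustration under stated assumptions: D = (0, 1), the averaging kernel is taken in the bump form $c\,e^{-\rho^2/(\rho^2 - r^2)}$ (an assumed shape for the kernel of [71]), and the test function `phi` and the helper names are hypothetical.

```python
import math

def midpoint(f, a, b, n=2000):
    # Composite midpoint rule on [a, b] with n cells.
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def make_kernel(rho):
    # Assumed averaging kernel: c * exp(-rho^2 / (rho^2 - r^2)) for |r| < rho,
    # zero otherwise, with c chosen so that the kernel integrates to 1.
    def raw(r):
        if abs(r) >= rho:
            return 0.0
        return math.exp(-rho * rho / (rho * rho - r * r))
    c = 1.0 / midpoint(raw, -rho, rho)
    return lambda r: c * raw(r)

def phi(x):
    # Test function on D = (0, 1), vanishing with its first derivative at the ends.
    return (x * (1.0 - x)) ** 2

def smoothed_delta_pairing(x0, rho):
    # (m_rho, phi) with the function-type kernel m_rho(x) = omega_rho(|x - x0|).
    omega = make_kernel(rho)
    return midpoint(lambda x: omega(x - x0) * phi(x), x0 - rho, x0 + rho)

x0 = 0.5
errors = [abs(smoothed_delta_pairing(x0, rho) - phi(x0)) for rho in (0.1, 0.05, 0.025)]
print(errors)  # the discrepancy with (delta, phi) = phi(x0) shrinks as rho decreases
```

The radii are chosen so that the support of each kernel stays at a distance of at least 2ρ from the ends of the interval, as required of the domains $D_\rho$ in the proof.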

We could have shown that functionals of the function type with
smooth kernels are also dense in $U^{(l)}(D)$.

Let us now define the operation of multiplication by a function
a(x) and the operation of differentiation $\partial/\partial x_k$ for elements of $U^{(l)}(D)$.

Let $a(x) \in C^{(h)}(D)$. If D is an infinite domain, we assume that a(x)
and its derivatives are bounded. We introduce the linear operator

$$A\varphi = a(x)\,\varphi(x),$$

which maps $\varphi(x) \in \mathring{C}^{(l)}(D)$ into $a(x)\,\varphi(x) \in \mathring{C}^{(s)}(D)$, where s = min (l, h).
The operation of multiplying an element m of $U^{(s)}(D)$ by a(x) is
defined as the operator A* conjugate to A, i.e. we define it by the
equation

$$(m, A\varphi) = (A^* m, \varphi), \tag{212}$$

which has to be satisfied for all $\varphi(x) \in \mathring{C}^{(l)}(D)$.

We can now apply the operator A* to an element of $U^{(s)}(D)$ and
$A^* m \in U^{(l)}(D)$. If the functional m has the form (205) with kernel
$\psi(x)$ summable over D, we have

$$(m, A\varphi) = \int_D \psi(x)\,[a(x)\,\varphi(x)]\,dx = \int_D [a(x)\,\psi(x)]\,\varphi(x)\,dx,$$

i.e. A*m is also a functional of the function type with kernel $a(x)\,\psi(x)$.
We now consider the differentiation operator

$$B\varphi = -\frac{\partial \varphi(x)}{\partial x_k}.$$

It is a bounded operator from $\mathring{C}^{(l)}(D)$ ($l \ge 1$) into $\mathring{C}^{(l-1)}(D)$. The
conjugate operator B* is also defined by an equation of form (212):

$$(m, B\varphi) = (B^* m, \varphi) \tag{213}$$

and maps an element m of $U^{(l-1)}(D)$ into $U^{(l)}(D)$. This equation
shows that, for a functional m expressible in form (205) with continuously
differentiable kernel $\psi(x)$, the kernel $\partial\psi(x)/\partial x_k$ corresponds
to the functional B*m, since

$$(m, B\varphi) = -\int_D \psi(x)\,\frac{\partial \varphi(x)}{\partial x_k}\,dx = \int_D \frac{\partial \psi(x)}{\partial x_k}\,\varphi(x)\,dx = (B^* m, \varphi).$$

Let us consider how to evaluate operators A* and B* of a functional
dependent on a δ-function, i.e. on functional (206).
Let $\varphi(x) \in \mathring{C}^{(l)}(D)$ ($l \ge 1$). Then

$$(\delta(x - x_0), A\varphi) = (A^*\delta(x - x_0), \varphi) = a(x_0)\,\varphi(x_0),$$
$$(\delta(x - x_0), B\varphi) = (B^*\delta(x - x_0), \varphi) = -\left.\frac{\partial \varphi(x)}{\partial x_k}\right|_{x=x_0}. \tag{214}$$
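
The defining relations (212) and (213) translate directly into code if a functional is modelled as a callable acting on test functions. The following one-dimensional sketch is only illustrative: the names `delta`, `from_kernel`, `multiply` and `differentiate` are hypothetical, the derivative of the test function is approximated by a central difference, and D is taken to be (0, 1).

```python
def midpoint(f, a, b, n=2000):
    # Composite midpoint rule on [a, b] with n cells.
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def delta(x0):
    # Functional (206): (delta(x - x0), phi) = phi(x0).
    return lambda phi: phi(x0)

def from_kernel(psi, a=0.0, b=1.0):
    # Functional of the function type (205): (m, phi) = integral of psi * phi over D.
    return lambda phi: midpoint(lambda x: psi(x) * phi(x), a, b)

def multiply(a_func, m):
    # A* m, defined through (212): (A* m, phi) = (m, a * phi).
    return lambda phi: m(lambda x: a_func(x) * phi(x))

def differentiate(m, h=1e-5):
    # B* m, defined through (213): (B* m, phi) = (m, -phi'),
    # with phi' replaced by a central difference (an approximation).
    return lambda phi: m(lambda x: -(phi(x + h) - phi(x - h)) / (2.0 * h))

def phi(x):
    # Test function vanishing with its first derivative at the ends of (0, 1).
    return (x * (1.0 - x)) ** 2

x0 = 0.3
print(multiply(lambda x: 1.0 + x, delta(x0))(phi))  # compare with a(x0) * phi(x0), cf. (214)
print(differentiate(delta(x0))(phi))                # compare with -phi'(x0), cf. (214)
print(differentiate(from_kernel(lambda x: x))(phi)) # compare with the integral of psi' * phi
```

Defining the operations on the functional side through the test function, rather than on a kernel, is what lets the same two helpers act on both the delta functional and functionals of the function type.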


We can introduce successive differentiation, where the result is 
easily shown to be independent of the order of differentiation. We can 
hence define derivatives up to order l and various differential operators
for elements of U'(D). The same problems can be posed for these 
operators as for ordinary functions; viz. Cauchy’s problem and various 
boundary value problems. Generalized functions were first introduced by 
Sobolev when solving Cauchy’s problem for linear hyperbolic equations 
(1936). This extension of the class of entities proves useful from two 
points of view: firstly, it may happen that a problem has no solution 
in the class of ordinary functions, but has a solution in the class 
of generalized functions (functionals). Secondly, it is sometimes 
easier to prove the existence of a “poor” solution, represented by a 
generalized function, and then to investigate the question as to 
whether this generalized function is an ordinary one. 

Both these circumstances become clear in the example of Cauchy’s 
problem for various systems of partial differential equations with 
constant coefficients: 





Bu (æ, t) N Z Ou (ad 
Ot = 2h Oar, + Sayu j (x, t) + fi ( T, t). (215) 
(x, 0) = g; (x) 


On applying Fourier’s transformation with respect to x, this problem
reduces to the Cauchy problem for a system of ordinary differential
equations with constant coefficients, depending on the numerical
parameters $\alpha_k$. Application of the inverse Fourier transformation
enables us to pass from the solutions of these ordinary equations to
the solutions of the initial problem. If we remain within the framework
of classical Fourier transformations, we are confined to considering
only initial functions and terms $f_i$ decreasing in a definite sense.
The subject was investigated by Petrovskii with this assumption.
But it was shown from supplementary arguments in this author’s
works that there exists a class of so-called hyperbolic systems, for
which the solution at an arbitrary point $(x_0, t_0)$ of space is defined by
the values of the initial functions $\varphi_i(x)$ only in some bounded part
of space x, depending on the point $(x_0, t_0)$. It was thereby established
that problem (215) for hyperbolic systems is uniquely soluble for
any behaviour of $\varphi_i(x)$ with indefinite increase of |x|.

Further, it has been shown in works of Tikhonov, Ladyzhenskaya
and Edelmann that the initial functions for so-called parabolic
systems can be taken as not merely non-decreasing as |x| → ∞,
but even as indefinitely (exponentially) increasing. But definite
restrictions have to be imposed in this case on the order of growth
in order to preserve the uniqueness theorems. Finally, as regards
the inverse Cauchy problem for the heat conduction equation (i.e.
the Cauchy problem for the equation $\partial u/\partial t = \Delta u$, solved downwards
in t), it was known to have an ordinary solution only with
special initial data.

All these facts required a more careful study of problem (215)
and in connection with this, of the Fourier transformations of functions
with arbitrary behaviour at infinity. This study was initiated by
Schwartz and performed in detail by Gel’fand and Shilov. The Fourier
transform of a function increasing at infinity is in general no longer
a function, but a functional in some $U^{(l)}(R_n)$. It is a matter of further
considering the Cauchy problem for systems of ordinary differential
equations in the class of these functionals, then passing from these
to the solutions of problem (215), which prove in certain cases to be
ordinary, and in others, generalized functions. We shall not give
the results here of these investigations, as carried out by Gel’fand,
Shilov and their pupils, but refer the reader to the works of these
authors on generalized functions and their applications.†


† Three parts of a substantial work by Gel’fand and Shilov on generalized
functions and their applications have just appeared.

However, we shall mention here some facts regarding the solutions in functionals
of linear second order equations of the elliptic, parabolic and hyperbolic
types with variable but smooth coefficients. We take one of these equations:

$$L(u) = f(x), \tag{216}$$

in which f(x) is a function having a singularity at $x = x_0$. For elliptic and parabolic
equations, all the solutions of (216) prove to be ordinary functions, smooth
everywhere excepting possibly at $x_0$, where they may have a singularity.
Taking the example of Laplace’s operator, we shall show below that, even if
f(x) is a delta-function, concentrated at the point $x_0$, the solutions of (216)
are still ordinary functions, having only a polarity at $x_0$. If equations (216)
are taken to be homogeneous, all the solutions are ordinary functions.

This is not the case for hyperbolic equations. For these, the singularity of
f(x) is extended over entire domains, and the solution may prove to be a generalized
and not an ordinary function. An example may be given. We take the
wave equation $u_{tt} = u_{x_1x_1} + u_{x_2x_2} + u_{x_3x_3}$. One of its solutions is given by
Poisson’s formula [II; 71]:

$$u(x,t) = \frac{t}{4\pi}\int_0^{2\pi}\!\!\int_0^{\pi}\varphi(x + tn)\,\sin\theta\,d\theta\,d\psi. \tag{217}$$

We know that, given a thrice continuously differentiable $\varphi(x)$, this formula
yields a twice continuously differentiable solution of the wave equation, satisfying
the initial conditions u(x, 0) = 0; $u_t(x, 0) = \varphi(x)$. Let $\varphi_m(x)$ be thrice continuously
differentiable non-negative functions, tending to the function
$\varphi(x) = 1/(x_1^2 + x_2^2)$ as m → ∞. It is clear from (217) that the corresponding solutions
$u_m(x, t)$ will tend to +∞ for (x, t) lying in the domain $\sqrt{x_1^2 + x_2^2} < t$ of
the half-space t > 0. This tells us that no solution in the form of an ordinary
function exists for the wave equation corresponding to the initial conditions
u(x, 0) = 0, $u_t(x, 0) = 1/(x_1^2 + x_2^2)$. Nevertheless the solution is unique in
the class of functionals. Similarly, by using Kirchhoff’s formula, it can be
shown that the non-homogeneous wave equation with right-hand side f(x)
equal to $\delta(x - x_0)$ also has no solutions in the class of ordinary functions, but
has them in the class of functionals.

We propose to consider all these questions in more detail in a later volume. 
As already mentioned, Sobolev’s works on the Cauchy problem for hyperbolic 
equations were the first to pose and solve problems in the class of generalized 
functions. 

We shall conclude by proving the above assertions regarding the solutions 
of Laplace’s equation. 

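Let $D$ be a bounded three-dimensional domain and $|x - x_0| = r$ the distance
from the variable point $x$ to $x_0$. Assuming the boundary of $D$ to be
sufficiently smooth, we obtain for $\varphi(x) \in \mathring{C}^{\infty}(D)$ by
applying Green's formula:

$$\varphi(x_0) = -\frac{1}{4\pi}\int_D \frac{\Delta\varphi(x)}{r}\,dx$$

or

$$\int_D \left(-\frac{1}{4\pi r}\right)\Delta\varphi(x)\,dx = (\delta(x - x_0), \varphi),$$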
D 




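whence it is clear, on the basis of the definition of the derivative of a
functional, that the functional $m_0$ with kernel $(-1/4\pi r)$ satisfies the
non-homogeneous Laplace equation:

$$\Delta m_0 \equiv \frac{\partial^2 m_0}{\partial x_1^2} + \frac{\partial^2 m_0}{\partial x_2^2} + \frac{\partial^2 m_0}{\partial x_3^2} = \delta(x - x_0). \tag{218}$$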











We have thereby shown that one of the solutions of (218) is a functional
of the function type. To obtain all the solutions of (218), we only need to find
all the solutions in functionals of the homogeneous Laplace equation:

$$\Delta m \equiv \frac{\partial^2 m}{\partial x_1^2} + \frac{\partial^2 m}{\partial x_2^2} + \frac{\partial^2 m}{\partial x_3^2} = 0. \tag{219}$$





We must show that all the functionals m, satisfying this equation, are 
expressible in terms of a kernel, and that these kernels are functions harmonic 
in D. Equation (219) is equivalent to 


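$$(m, \Delta_x \varphi) = 0 \tag{220}$$
for any function $\varphi(x) \in \mathring{C}^{\infty}(D)$. We take the following
function as $\varphi(x)$:
$$\varphi(x) = \frac{1}{4\pi|x - y|}\left[\psi\!\left(\frac{|x - y|}{\rho_1}\right) - \psi\!\left(\frac{|x - y|}{\rho_2}\right)\right], \tag{221}$$

where $\psi(t)$ is a non-negative infinitely differentiable function, equal to 1
for $t \in [0, 1/2]$ and zero for $t \ge 1$. If the point $y$ lies inside $D$,
$\varphi(x) \in \mathring{C}^{\infty}(D)$ for sufficiently small $\rho_1$ and
$\rho_2$. The function

$$\omega_\rho(|x - y|) = \begin{cases} 0 & \text{for } x = y,\\[4pt] \dfrac{1}{4\pi}\,\Delta_x\!\left[\dfrac{1}{|x - y|}\,\psi\!\left(\dfrac{|x - y|}{\rho}\right)\right] & \text{for } |x - y| > 0,\end{cases}$$

equal to zero for $|x - y| < \rho/2$ and for $|x - y| > \rho$, can be taken as the
averaging kernel, since, with $r = |x - y|$,

$$\int \omega_\rho(|x - y|)\,dx = \frac{1}{4\pi}\int \Delta_x\!\left[\frac{1}{r}\,\psi\!\left(\frac{r}{\rho}\right)\right]dx = \int_{\rho/2}^{\rho} \frac{r}{\rho^2}\,\psi''\!\left(\frac{r}{\rho}\right)dr = 1$$
and
$$\omega_\rho(|x - y|) = \frac{1}{\rho^3}\,\chi\!\left(\frac{|x - y|}{\rho}\right),$$
where
$$\chi(t) = \frac{1}{4\pi}\left(\frac{d^2}{dt^2} + \frac{2}{t}\frac{d}{dt}\right)\frac{\psi(t)}{t}.$$
On substituting function (221) for $\varphi(x)$ in (220), we obtain

$$0 = \bigl(m,\ \omega_{\rho_1}(|x - y|) - \omega_{\rho_2}(|x - y|)\bigr). \tag{222}$$
This can be rewritten in accordance with notation (209) as

$$m_{\rho_1}(y) = m_{\rho_2}(y)$$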




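for any $y$ whose distance from the boundary of $D$ is greater than
$\max(\rho_1, \rho_2)$. We take the set $V_\delta$ of all $\varphi(x)$ of
$\mathring{C}^{\infty}(D)$, equal to zero outside some domain $D_\delta$ at a
distance greater than $2\max(\rho_1, \rho_2)$ from the boundary of $D$. We have
for such $\varphi(x)$:

$$\int_{D_\delta} \varphi(y)\,m_{\rho_1}(y)\,dy = (m_{\rho_1}, \varphi) = \int_{D_\delta} \varphi(y)\,m_{\rho_2}(y)\,dy = (m_{\rho_2}, \varphi).$$

On the other hand, it has been shown that $(m_\rho, \varphi) \to (m, \varphi)$ as
$\rho \to 0$. Consequently, for all $\rho_k < \delta/2$ and the $\varphi(x)$ taken
by us:
$$(m, \varphi) = (m_{\rho_k}, \varphi) = \int_{D_\delta} \varphi(y)\,m_{\rho_k}(y)\,dy \quad (k = 1, 2),$$

i.e. the functional $m$ is specified by the kernel

$$m_\delta(x) = \begin{cases} 0 & \text{for } x \in D - D_\delta,\\ m_{\rho_1}(x) = m_{\rho_2}(x) & \text{for } x \in D_\delta.\end{cases}$$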


Let us show that $m_\delta(x)$ is a harmonic function in $D_\delta$. In fact, it
follows from (220) that, for $\varphi(x) \in V_\delta$:

$$0 = (m, \Delta_x\varphi) = \int_{D_\delta} m_\delta(x)\,\Delta_x\varphi(x)\,dx = \int_{D_\delta} \varphi(x)\,\Delta_x m_\delta(x)\,dx,$$

and, since $V_\delta$ is dense in $L_2(D_\delta)$, we have for $x \in D_\delta$:
$$\Delta_x m_\delta(x) = 0.$$

Since the number $\delta > 0$ was taken arbitrarily, and since
$m_{\delta'}(x) = m_\delta(x)$ for $\delta' < \delta$ and $x \in D_\delta$, we can say
that the family of harmonic functions $m_\delta(x)$ defines a function $m(x)$
harmonic in $D$, coinciding with $m_\delta(x)$ for $x \in D_\delta$. This harmonic
function in fact generates the functional $m$ required, since, if we take an
arbitrary $\varphi(x)$ of $\mathring{C}^{\infty}(D)$, it is equal to zero outside
some domain $D_\delta$, so that we have for it, in view of our discussion:

$$(m, \varphi) = \int_{D_\delta} m_\delta(x)\,\varphi(x)\,dx = \int_D m(x)\,\varphi(x)\,dx. \tag{223}$$
Dò D 


The behaviour of $m(x)$ as $x$ approaches the boundary of $D$ is defined by the
fact that integral (223) must be convergent for any $\varphi(x)$ of
$\mathring{C}^{\infty}(D)$.

A corollary of the result obtained must be mentioned. Suppose that $\psi(x)$ is
a function summable in $D$ ($D$ is a finite domain) and

$$\int_D \psi(x)\,\Delta_x\varphi(x)\,dx = 0$$
for any $\varphi(x)$ of $\mathring{C}^{\infty}(D)$.
The functional
$$(m, \varphi) = \int_D \psi(x)\,\varphi(x)\,dx$$

satisfies equation (219), and by what has been said, we can assert that
$\psi(x)$ is equivalent to a function harmonic in $D$.




A similar discussion can be given of linear functionals on various
families of functions. We shall mention an example. Let $K$ be the
family of real functions $\varphi(x)$, defined throughout space $R_n$, finite and
having continuous derivatives of all orders. The family $K$ is a linear
space. It cannot be normed in the ordinary sense of the word, and
we introduce the following definition for this space only:

DEFINITION. We shall say that the sequence $\varphi_k(x)$ $(k = 1, 2, \ldots)$ of
functions of $K$ tends to zero if there exists a bounded domain outside
which all the $\varphi_k(x)$ are zero, and if the $\varphi_k(x)$ and all their
derivatives tend uniformly to zero as $k \to \infty$.

The functional $(m, \varphi)$ is defined in $K$ by associating the real number
$(m, \varphi)$ with each $\varphi(x) \in K$. Such a functional is said to be linear
(or linear and continuous), if it is distributive, i.e.
$(m, c_1\varphi_1 + c_2\varphi_2) = c_1(m, \varphi_1) + c_2(m, \varphi_2)$, and has
the property that $(m, \varphi_k) \to 0$ as the sequence $\varphi_k(x)$ tends to
zero. Functionals of the function type are defined by (205), where $D$ is $R_n$
and $\psi(x)$ is any given function, summable in any bounded domain.
Multiplication of the functional by the function $\omega(x)$, having continuous
derivatives of all orders, is defined by the equation
$(\omega m, \varphi) = (m, \omega\varphi)$ and differentiation of a functional by

$$(D^k m, \varphi) = (-1)^k (m, D^k\varphi).$$

The functional has derivatives of all orders. The theory of functionals
in space $K$ is discussed in the above-mentioned work by Gel'fand and
Shilov.


CHAPTER V 


HILBERT SPACE 


§ 1. The theory of bounded operators 


120. Axioms of the space. When discussing function space $L_2$ and
sequence space $l_2$ we explained their identity of structure; they are
realizations of the same abstract space, to the investigation of which
the present chapter is devoted. It was first introduced by Hilbert
in the form $l_2$ and is generally known as $H$ or Hilbert space. As we
shall see below, $H$ space is a particular case of a $B$ space, and everything
said about these latter applies to $H$ space. But apart from all this,
$H$ space has its own special properties.

Let us enumerate the axioms defining H. H space is a linear space, 
the elements of which satisfy axiom A of [95]. We shall assume that 
elements of H can be multiplied by complex numbers (a complex 
linear space). 

We shall further assume that, if there is no proviso to the contrary, 
given any positive integer n, there exist n linearly independent 
elements (axiom B of [95]). We now introduce a new axiom, relating 
to scalar products: 

Axiom C. Each pair of elements $x$ and $y$ of $H$ has associated with
it a definite complex number, which is called the scalar product of $x$
and $y$. It is denoted by $(x, y)$. This scalar product has the following
properties:

$$(y, x) = \overline{(x, y)}; \quad (x' + x'', y) = (x', y) + (x'', y); \quad (x, x) > 0 \ \text{if}\ x \ne 0; \quad (ax, y) = a(x, y). \tag{1}$$

We recall that $x \ne 0$ means that $x$ is not the zero element. The
following are immediate corollaries of the above properties:

$$(x, y' + y'') = (x, y') + (x, y''); \quad (x, ay) = \bar{a}(x, y); \quad (x, x) = 0 \ \text{if}\ x = 0; \quad (x, y) = 0 \ \text{if}\ x \ \text{or}\ y = 0. \tag{2}$$




The expression $\sqrt{(x, x)}$, in which the value of the root is assumed $\ge 0$,
is called the norm of element $x$ and is denoted by $\|x\|$, as in [95].
We now show that the norm thus defined satisfies the conditions
of axiom C of [95]. It follows at once from the definition that $\|x\| \ge 0$,
where the = sign only holds for the zero element. We have further:

$$\|ax\|^2 = (ax, ax) = a\bar{a}\,(x, x) = |a|^2 \cdot \|x\|^2, \quad \text{i.e.} \quad \|ax\| = |a| \cdot \|x\|. \tag{3}$$
Hence it follows that $\|-x\| = \|x\|$ [95].
It remains to verify that
$$\|x + y\| \le \|x\| + \|y\|. \tag{4}$$
We must first show that
$$|(x, y)| \le \|x\| \cdot \|y\|, \tag{5}$$


which will in future be called Buniakowski's inequality [cf. IV; 35].
Let $x$ and $y$ be any elements of $H$, and $a$ and $b$ any complex numbers.
We have

$$\|ax + by\|^2 = (ax + by, ax + by) = a\bar{a}\,(x, x) + a\bar{b}\,(x, y) + \bar{a}b\,(y, x) + b\bar{b}\,(y, y) \ge 0.$$

This is a Hermitian positive form (with respect to variables $a$ and
$b$), whose discriminant must be non-negative, i.e.

$$(x, x)(y, y) - (x, y)(y, x) \ge 0 \quad \text{or} \quad \|x\|^2 \cdot \|y\|^2 - |(x, y)|^2 \ge 0,$$


whence (5) follows. We turn to the proof of (4). Using the obvious
equation
$$(x, y) + (y, x) = 2\operatorname{Re}(x, y), \tag{6}$$

where Re is the notation for the real part, we get
$$\|x + y\|^2 = (x + y, x + y) = \|x\|^2 + \|y\|^2 + 2\operatorname{Re}(x, y),$$
whence it follows, in view of $|\operatorname{Re}(x, y)| \le |(x, y)|$ and inequality (5):
$$\|x + y\|^2 \le \|x\|^2 + \|y\|^2 + 2\|x\| \cdot \|y\| = (\|x\| + \|y\|)^2,$$

and we arrive at (4).
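
As a numerical illustration (not part of the original text), the following sketch checks property (1), Buniakowski's inequality (5) and the triangle inequality (4) in the finite-dimensional model $\mathbb{C}^3$, taking $(x, y) = \sum_k x_k \bar{y}_k$ as the scalar product. The helper names `inner` and `norm` and the sample vectors are assumptions made for the example.

```python
import math

def inner(x, y):
    # Scalar product (x, y) = sum of x_k * conj(y_k): linear in x, as in axiom C.
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def norm(x):
    # ||x|| = sqrt((x, x)); the value (x, x) is real and non-negative.
    return math.sqrt(inner(x, x).real)

x = [1 + 2j, -1j, 3 + 0j]
y = [2 - 1j, 4 + 0j, 1j]
x_plus_y = [a + b for a, b in zip(x, y)]

# Property (1): (y, x) is the complex conjugate of (x, y).
assert abs(inner(y, x) - inner(x, y).conjugate()) < 1e-12
# Inequality (5): |(x, y)| <= ||x|| * ||y||.
assert abs(inner(x, y)) <= norm(x) * norm(y)
# Inequality (4): ||x + y|| <= ||x|| + ||y||.
assert norm(x_plus_y) <= norm(x) + norm(y)
```

A finite check of this kind proves nothing, of course; it only makes the roles of the conjugation in (1) and of the norm in (4) and (5) concrete.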
As in [95], it follows from (4) that

$$\|x - y\| \ge \|x\| - \|y\|; \qquad \|x - y\| \le \|x - z\| + \|z - y\|. \tag{7}$$




The concept of norm leads, as in [95], to the concept of the distance
between elements $x$ and $y$: $\rho(x, y) = \|x - y\|$, and to the concept
of the limit of a sequence $x_n$ (strong convergence):

$$x_n \Rightarrow x_0 \quad \text{if} \quad \|x_0 - x_n\| \to 0. \tag{8}$$


Everything said earlier about the limit is still valid. We have seen
[95] that, if $a_n \to a_0$, $x_n \Rightarrow x_0$ and $y_n \Rightarrow y_0$, then
$a_n x_n \Rightarrow a_0 x_0$ and $x_n + y_n \Rightarrow x_0 + y_0$. We shall now
prove a theorem.

THEOREM. If $x_n \Rightarrow x_0$ and $y_n \Rightarrow y_0$, then
$(x_n, y_n) \to (x_0, y_0)$. We put $u_n = x_n - x_0$ and $v_n = y_n - y_0$. By
hypothesis, $\|u_n\|$ and $\|v_n\| \to 0$. We have

$$(x_0, y_0) - (x_n, y_n) = (x_0, y_0) - (x_0 + u_n, y_0 + v_n) = -(x_0, v_n) - (u_n, y_0) - (u_n, v_n)$$
and we can write, on applying (5):
$$|(x_0, y_0) - (x_n, y_n)| \le \|x_0\| \cdot \|v_n\| + \|u_n\| \cdot \|y_0\| + \|u_n\| \cdot \|v_n\|,$$

whence it follows, since $\|u_n\|$ and $\|v_n\| \to 0$, that
$(x_n, y_n) \to (x_0, y_0)$.

This gives us, when $x_n = y_n$: if $x_n \Rightarrow x_0$ then
$\|x_n\| \to \|x_0\|$ [cf. 95].

If the sequence $x_n$ has a limiting element, it is mutually convergent,
i.e. $\|x_n - x_m\| \to 0$ as $n$ and $m \to \infty$ [95]. We shall assume
that $H$ space is complete.

Axiom D. If a sequence $x_n$ is mutually convergent, there exists an
element $x_0$ of $H$ such that $x_n \Rightarrow x_0$. In addition, we shall take
the following axiom:

Axiom E. H space is separable. In other words, there exists a 
denumerable set of elements of H which is dense in H. 

It follows at once from what has been said that H is a B space. 


121. Orthogonality and orthogonal systems of elements. If $(x, y) = 0$,
then $(y, x) = 0$, by (1), and the elements $x$ and $y$ are now said
to be mutually orthogonal, or simply orthogonal, and we write
$x \perp y$. By (2), the zero element is orthogonal to any element.

Let $x_1, x_2, \ldots, x_m$ be mutually orthogonal elements, i.e.
$(x_p, x_q) = 0$ for $p \ne q$. Let us take the square of the norm of the sum of
these elements:

$$\|x_1 + x_2 + \ldots + x_m\|^2 = (x_1 + x_2 + \ldots + x_m,\ x_1 + x_2 + \ldots + x_m).$$




On expanding the scalar product in accordance with (1) and (2)
and using the orthogonality, we obtain the following Pythagoras'
theorem for mutually orthogonal elements:

$$\|x_1 + x_2 + \ldots + x_m\|^2 = \|x_1\|^2 + \|x_2\|^2 + \ldots + \|x_m\|^2. \tag{9}$$


As in [95], we can use the concept of limit to bring in the idea
of the convergence of an infinite series formed from elements of $H$:

$$u_1 + u_2 + u_3 + \ldots \tag{10}$$

Such a series is called convergent if the sum of its first $n$ terms
$s_n = u_1 + u_2 + \ldots + u_n$ tends to a limit: $s_n \Rightarrow u$ as
$n \to \infty$. The element $u$ is called the sum of series (10). The necessary
and sufficient condition for the convergence of series (10) follows at once
from the axiom of completeness and what we have said regarding mutual
convergence: given any $\varepsilon > 0$, there exists an $N$ such that

$$\|u_{n+1} + u_{n+2} + \ldots + u_{n+p}\| \le \varepsilon \quad \text{for } n > N \text{ and } p \ge 1. \tag{$10_1$}$$


This convergence condition has a particularly simple form when
the terms of series (10) are orthogonal to each other, i.e. $(u_p, u_q) = 0$
for $p \ne q$.

THEOREM. If the terms of series (10) are orthogonal to each other,
the necessary and sufficient condition for its convergence is the
convergence of the following series of non-negative numbers:

$$\sum_{k=1}^{\infty} \|u_k\|^2. \tag{11}$$

For, by Pythagoras' theorem, condition ($10_1$) can be written in
this case as

$$\|u_{n+1}\|^2 + \|u_{n+2}\|^2 + \ldots + \|u_{n+p}\|^2 \le \varepsilon^2 \quad \text{for } n > N \text{ and } p \ge 1,$$
and this latter condition is necessary and sufficient for the convergence
of series (11).

Whatever the rearrangement of the terms of series (10), the terms 
of series (11) undergo the same rearrangement. But this has no effect 
on its convergence. Consequently, rearrangement of the terms of 
series (10) does not affect its convergence — if it is convergent, it 
remains so after rearrangement; if it is not convergent, it cannot 
become so after rearrangement. By using Pythagoras’ theorem and 
the convergence of series (11), it is easily shown that the sum of the 
series does not depend on the order of the terms. 

We say that a sequence of elements

$$x_1, x_2, x_3, \ldots \tag{12}$$




forms an orthogonal normalized (orthonormal) system if

$$(x_p, x_q) = \begin{cases} 0 & \text{for } p \ne q,\\ 1 & \text{for } p = q.\end{cases} \tag{13}$$


In view of the above theorem, we can assert that the necessary and
sufficient condition for convergence of

$$\sum_{k=1}^{\infty} a_k x_k \tag{14}$$
is that the following series of non-negative numbers be convergent:

$$\sum_{k=1}^{\infty} |a_k|^2. \tag{15}$$

Suppose that this condition is satisfied, and let $x$ denote the sum
of series (14). We form the scalar product

$$\Bigl(\sum_{k=1}^{n} a_k x_k,\ x_p\Bigr).$$

By (13), it is equal to $a_p$ when $n \ge p$, so that we obtain on passing
to the limit as $n \to \infty$:

$$a_p = (x, x_p). \tag{16}$$


The numbers $a_p$ defined by this formula are called the Fourier
coefficients of the element $x$ with respect to system (12), whilst
series (14) is the Fourier series of the element $x$. We obviously have

$$\Bigl\|x - \sum_{k=1}^{n} a_k x_k\Bigr\|^2 = \|x\|^2 - \sum_{k=1}^{n} |a_k|^2, \tag{17}$$

and on indefinite increase of $n$, we get the closure equation
$$\|x\|^2 = \sum_{k=1}^{\infty} |a_k|^2. \tag{18}$$


It follows from the above discussion that, if series (14) is convergent,
it is the Fourier series for its own sum $x$, and the closure equation
(18) holds. Now suppose conversely that some element $x$ of $H$ is
given. Let us form its Fourier coefficients (16) and write formula (17).
This leads us to Bessel's inequality

$$\sum_{k=1}^{\infty} |a_k|^2 \le \|x\|^2. \tag{19}$$




The series on the left is necessarily convergent, i.e. the Fourier
series of any element $x$ is necessarily convergent. If the = sign holds
in (19), it means, by (17), that the sum of the Fourier series of the
element $x$ is precisely equal to this element $x$. System (12) is said
to be closed if the = sign holds in (19) for any element $x$ of $H$. System
(12) is said to be complete if there exists no element of $H$, apart
from zero, which is orthogonal to all the $x_k$. It can be shown, precisely
as in [58], that closure and completeness are equivalent. If system
(12) is closed, every element $x$ of $H$ is uniquely expressible as a
convergent series (14), i.e. as its Fourier series. Let $a_k$ be the Fourier
coefficients of the element $x$ and $b_k$ those of element $y$. If system (12)
is closed, we obtain a generalized closure equation as in [58]:

$$(x, y) = \sum_{k=1}^{\infty} a_k \bar{b}_k. \tag{$18_1$}$$
k=1 


Notice also that, if $c_k$ are any complex numbers and $a_k$ are the
Fourier coefficients of an element $x$, the formula holds [cf. 58]:

$$\Bigl\|x - \sum_{k=1}^{n} c_k x_k\Bigr\|^2 = \|x\|^2 - \sum_{k=1}^{n} |a_k|^2 + \sum_{k=1}^{n} |c_k - a_k|^2.$$

A comparison with (17) shows us that the left-hand side of the
last formula takes its least value when the $c_k$ are the Fourier coefficients
of $x$.
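
The formulas for Fourier coefficients can be tried out in a finite-dimensional model. The sketch below is an illustration under stated assumptions, not the book's construction: it works in $\mathbb{C}^3$ with the scalar product $(x, y) = \sum_k x_k \bar{y}_k$, uses a two-element orthonormal system (so the system is not closed), and checks formula (17), Bessel's inequality (19) and the minimum property of the Fourier coefficients. All function and variable names are hypothetical.

```python
import math

def inner(x, y):
    # (x, y) = sum of x_k * conj(y_k), linear in the first argument.
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def norm_sq(x):
    return inner(x, x).real

def distance_sq(x, coeffs, system):
    # || x - sum_k c_k x_k ||^2 for a finite orthonormal system.
    approx = [0j] * len(x)
    for c, e in zip(coeffs, system):
        approx = [p + c * ek for p, ek in zip(approx, e)]
    return norm_sq([a - b for a, b in zip(x, approx)])

s = 1 / math.sqrt(2)
system = [[s, s * 1j, 0j], [s, -s * 1j, 0j]]  # orthonormal, but not closed in C^3
x = [1 + 1j, 2 + 0j, 3 - 1j]

a = [inner(x, e) for e in system]             # Fourier coefficients, formula (16)
bessel_sum = sum(abs(ak) ** 2 for ak in a)

# Formula (17) and Bessel's inequality (19).
assert abs(distance_sq(x, a, system) - (norm_sq(x) - bessel_sum)) < 1e-12
assert bessel_sum <= norm_sq(x)

# Any other coefficients c_k add sum |c_k - a_k|^2 to the squared distance.
c = [0.5 - 1j, 2j]
extra = sum(abs(ck - ak) ** 2 for ck, ak in zip(c, a))
assert abs(distance_sq(x, c, system) - (norm_sq(x) - bessel_sum + extra)) < 1e-12
```

Here the inequality in (19) is strict, because the third coordinate of `x` is orthogonal to both elements of the system.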

Notice that, if $H$ space is taken in the concrete form of function
space $L_2$, convergence in $H$ will correspond to convergence in the
mean in $L_2$, which we discussed in [56]. Convergence of the Fourier
series here leads to the closure equation of [58].

Let us now recall an orthogonalization process that we have already
employed in the case of $n$-dimensional complex space [III$_1$; 29].
Let

$$z_1, z_2, z_3, \ldots \tag{20}$$

be an infinite sequence of non-zero elements of $H$.

We form the normalized element $x_1 = z_1/\|z_1\|$. Let $z_k$ be the first
of the elements of (20) after $z_1$ that cannot be written in the form
$a x_1$. We form the element $y_2 = z_k - (z_k, x_1)\,x_1$, which certainly
differs from zero, then we normalize it, i.e. form $x_2 = y_2/\|y_2\|$. Further,
let $z_l$ be the first of elements (20) after $z_k$ that cannot be written
in the form $a_1 x_1 + a_2 x_2$. We form the element

$$y_3 = z_l - (z_l, x_1)\,x_1 - (z_l, x_2)\,x_2,$$




which certainly differs from zero, then normalize it, i.e. form
$x_3 = y_3/\|y_3\|$. By proceeding in this way, we get the orthogonal
and normalized system (12), which has the following property: every
element $x_p$ is a finite linear combination of elements (20), and vice
versa, an element $z_k$ being expressible in terms of only the first $k$
elements $x_s$. Notice that the mutually orthogonal, non-zero elements
$y_s$ $(s = 1, 2, \ldots, m)$ are linearly independent. For, suppose we had

$$c_1 y_1 + c_2 y_2 + \ldots + c_m y_m = 0.$$

On multiplying both sides by $y_k$ and taking the orthogonality into
account, we get $c_k \|y_k\|^2 = 0$, i.e. $c_k = 0$ $(k = 1, 2, \ldots, m)$, whence
follows the linear independence of the $y_s$.
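
The orthogonalization process just described can be sketched in code for a finite list of vectors in $\mathbb{C}^3$. This is a minimal illustration under stated assumptions: the scalar product is $(x, y) = \sum_k x_k \bar{y}_k$, an element counts as "expressible in terms of the earlier ones" when the remainder has norm below a tolerance `tol`, and the names `inner`, `norm` and `orthonormalize` are hypothetical.

```python
import math

def inner(x, y):
    # (x, y) = sum of x_k * conj(y_k), linear in the first argument.
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x).real)

def orthonormalize(zs, tol=1e-12):
    """Orthonormalize zs in order, skipping elements that depend on earlier ones."""
    xs = []
    for z in zs:
        # y = z - sum_j (z, x_j) x_j, as in the construction of y_2, y_3, ...
        y = [complex(c) for c in z]
        for e in xs:
            coeff = inner(z, e)
            y = [yk - coeff * ek for yk, ek in zip(y, e)]
        length = norm(y)
        if length > tol:  # z is not a combination of x_1, ..., x_j already found
            xs.append([yk / length for yk in y])
    return xs

# The second vector is a multiple of the first and is therefore skipped.
zs = [[1, 1j, 0], [2, 2j, 0], [1, 0, 1], [0, 1, 1]]
xs = orthonormalize(zs)

for p, xp in enumerate(xs):
    for q, xq in enumerate(xs):
        expected = 1.0 if p == q else 0.0
        assert abs(inner(xp, xq) - expected) < 1e-12  # condition (13)
```

In exact arithmetic the tolerance would be replaced by the exact test of linear dependence; in floating point some threshold of this kind is unavoidable.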

Since $H$ is separable, there exists a denumerable set $M$ of elements

$$u_1, u_2, u_3, \ldots \tag{21}$$

dense in $H$. If we orthogonalize the sequence $u_k$, we get a complete
(closed) orthonormal system $x_k$ $(k = 1, 2, \ldots)$, consisting of a
denumerable set of elements. The closure of the system follows at
once from the fact that set (21) is everywhere dense in $H$. If only
a finite number of elements remained after orthogonalization, $H$
would be finite-dimensional.

Conversely, if there exists in $H$ a complete orthonormal system
$x_k$ $(k = 1, 2, \ldots)$, consisting of a denumerable set of elements, it is
easily shown that the finite sums $c_1 x_1 + c_2 x_2 + \ldots + c_n x_n$ with
complex rational coefficients $c_s$ ($c_s = a_s + b_s i$, where $a_s$ and $b_s$ are
real rational numbers) form a denumerable set, dense in $H$, i.e. the
separability of $H$ is equivalent to the existence in $H$ of a complete
orthonormal system, representing a denumerable set of elements.

Let us prove a further consequence of the separability of $H$, viz.
every orthonormal system $(v)$ consists of a finite or denumerable
set of elements.

Let $x$ and $y$ be two mutually orthogonal and normalized elements, i.e.
$(x, y) = 0$ and $\|x\| = \|y\| = 1$. We have
$\|x - y\|^2 = (x - y, x - y) = 2$ or $\|x - y\| = \sqrt{2}$, i.e. the distance
between two orthogonal and normalized elements is equal to $\sqrt{2}$. Now
suppose that $(v)$ is a set of orthonormal elements. We fix $\varepsilon$ so that
$0 < \varepsilon < 1/\sqrt{2}$. Given any $v$ of $(v)$, there exists an element $u_k$
of set (21), dense in $H$, such that $\|u_k - v\| < \varepsilon$. On the other hand,
if $k$ is fixed, only one element of $(v)$ can satisfy $\|u_k - v\| < \varepsilon$,
since if there were two distinct elements $v_1$ and $v_2$ satisfying this
inequality, we should have by the triangle rule:
$\|v_1 - v_2\| < 2\varepsilon < \sqrt{2}$, whilst we must have
$\|v_1 - v_2\| = \sqrt{2}$. It follows at once from what has been said that the
set $(v)$ is finite or denumerable.


122. Projections. The concepts of lineal and subspace [95] will play
an essential part in what follows.

A set of elements belonging to any fixed subspace L satisfies all 
the axioms enumerated above, excepting possibly axiom B, since 
the subspace may be finite-dimensional. Thus every infinite-dimen- 
sional subspace can be regarded as an independent complex Hilbert 
space. This remark is quite obvious as regards all the axioms excepting 
the axiom of separability. We have to prove the following as regards 
this latter: if H is separable, every subspace L is a separable Hilbert 
space. The proof of this presents no difficulty [94]. 

Two subspaces $L$ and $M$ are said to be mutually orthogonal if
any element of $L$ is orthogonal to any element of $M$. We write this
as $L \perp M$. An element $x$ is said to be orthogonal to a subspace $L$
if it is orthogonal to any element of $L$. We write $x \perp L$. Let us now
prove a theorem, vital for what follows.

THEOREM. If $L$ is a subspace, any element $x$ of $H$ can be written as

$$x = y + z, \tag{22}$$

where $y \in L$ and $z \perp L$. Form (22) is unique.

If $x \in L$, we get form (22) by putting $x = x + 0$. Now suppose
that $x$ does not belong to $L$. Let $d$ be the strict lower bound of the
set of positive numbers $\|x - y\|^2$, for $y$ belonging to subspace $L$:

$$d = \inf_{y \in L} \|x - y\|^2. \tag{23}$$

There exists a sequence of elements $y_n$ belonging to $L$ such that
$$(x - y_n, x - y_n) = \|x - y_n\|^2 = d_n \to d. \tag{24}$$


Let $u$ be any element of $L$. Since $L$ is a subspace, given any real
(or even complex) $\varepsilon$, the elements $y_n + \varepsilon u$ belong to $L$, and,
in view of (23), we can write
$(x - y_n - \varepsilon u,\ x - y_n - \varepsilon u) \ge d$. On expanding the scalar
product, we get

$$(u, u)\,\varepsilon^2 - 2\operatorname{Re}(x - y_n, u)\,\varepsilon + (d_n - d) \ge 0.$$

The quadratic form on the left is non-negative for any real $\varepsilon$, so
that we must have

$$|\operatorname{Re}(u, x - y_n)| \le \sqrt{d_n - d}\;\|u\|. \tag{25}$$




We shall strengthen this inequality. Let $\varphi$ be the amplitude of
the complex number $(u, x - y_n)$, i.e.
$(u, x - y_n) = |(u, x - y_n)|\,e^{i\varphi}$. We replace the element $u$ by
$u e^{-i\varphi}$ in (25), where the latter element also belongs to $L$. On
observing that $\|u e^{-i\varphi}\| = \|u\|$ and that

$$(u e^{-i\varphi}, x - y_n) = e^{-i\varphi}(u, x - y_n) = |(u, x - y_n)|,$$
we obtain from (25) the stricter inequality
$$|(u, x - y_n)| \le \sqrt{d_n - d}\;\|u\|. \tag{26}$$


Remember that the $x$ in this inequality is a given element of $H$,
$y_n$ is a sequence of elements of $L$ satisfying condition (24), and $u$
is any element of $L$. Let us now find an upper bound of the scalar
product $(u, y_n - y_m)$. On writing $y_n - y_m = (y_n - x) + (x - y_m)$,
and using (26), we get

$$|(u, y_n - y_m)| \le |(u, x - y_n)| + |(u, x - y_m)| \le \bigl(\sqrt{d_n - d} + \sqrt{d_m - d}\bigr)\|u\|.$$
On putting $u = y_n - y_m$ in this inequality and cancelling
$\|y_n - y_m\|$ in both sides, we arrive at

$$\|y_n - y_m\| \le \sqrt{d_n - d} + \sqrt{d_m - d}.$$

Notice that this inequality is obvious if $\|y_n - y_m\| = 0$. By (24),
its right-hand side tends to zero on indefinite increase of $m$ and $n$,
so that the sequence of elements $y_n$ is mutually convergent. By the
axiom of completeness, there exists an element $y$ such that
$y_n \Rightarrow y$, and $y \in L$, since $L$ is a subspace. On the other hand, on
passing to the limit in (26), we obtain for any element $u$ of $L$:
$(u, x - y) = 0$, i.e. $x - y$ is orthogonal to $L$. On putting $x - y = z$, we
in fact get (22), in which $y \in L$ and $z \perp L$. It remains to prove that
the form (22) obtained is unique. Let there be two forms:

$$x = y + z = y_1 + z_1,$$

where $y$ and $y_1 \in L$ and $z$ and $z_1 \perp L$. We obviously have
$y - y_1 = z_1 - z$. The left-hand side of this equation is an element of $L$,
and the right-hand side an element orthogonal to $L$. Hence it follows
that $(y - y_1, y - y_1) = 0$, i.e. $\|y - y_1\| = 0$, so that $y_1 = y$, i.e.
$z_1 = z$. The theorem is fully proved.
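
In a finite-dimensional model the decomposition (22) can be computed directly. The sketch below is an illustration only, and it does not follow the minimizing-sequence argument of the proof: it assumes that $L$ is the span of two given vectors of $\mathbb{C}^3$, builds an orthonormal basis of $L$, and takes $y$ as the sum of the corresponding Fourier terms of $x$. It then checks that $z = x - y$ is orthogonal to $L$ and that another element of $L$ lies farther from $x$ than $y$ does. All names are hypothetical.

```python
import math

def inner(x, y):
    # (x, y) = sum of x_k * conj(y_k), linear in the first argument.
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x).real)

def orthonormal_basis(vectors, tol=1e-12):
    # Orthogonalization of a finite list, skipping dependent vectors.
    basis = []
    for v in vectors:
        y = [complex(c) for c in v]
        for e in basis:
            coeff = inner(v, e)
            y = [yk - coeff * ek for yk, ek in zip(y, e)]
        length = norm(y)
        if length > tol:
            basis.append([yk / length for yk in y])
    return basis

def project(x, basis):
    # Returns (y, z) with y in the span of the basis and z = x - y.
    y = [0j] * len(x)
    for e in basis:
        coeff = inner(x, e)
        y = [yk + coeff * ek for yk, ek in zip(y, e)]
    z = [xk - yk for xk, yk in zip(x, y)]
    return y, z

spanning = [[1, 0, 1j], [0, 1, 1]]   # L is the span of these two vectors
x = [2 + 1j, -1, 3]
y, z = project(x, orthonormal_basis(spanning))

# z is orthogonal to every spanning vector, hence to the whole of L.
for u in spanning:
    assert abs(inner(z, u)) < 1e-12

# Another element w of L is farther from x than the projection y.
w = [yk + 0.3 * a - 0.2j * b for yk, a, b in zip(y, spanning[0], spanning[1])]
assert norm([xk - wk for xk, wk in zip(x, w)]) > norm(z)
```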

The element $y$ of $L$ in (22) is called the projection of the element
$x$ on the subspace $L$. A set of elements orthogonal to a subspace $L$
obviously forms a subspace. We write this as $M$. By the above
theorem, each element $x$ of $H$ can be expressed uniquely as the sum
of two elements, one of which belongs to $L$, and the other to $M$. The
set of elements orthogonal to $M$ forms the subspace $L$. The
relationship between subspaces $L$ and $M$ is mutual in this respect, and two
such subspaces are described as complementary. In the case of real
three-dimensional space, e.g. the sets of vectors forming the $XY$
plane and the $Z$ axis are complementary spaces.
We usually write in the above case:

$$H = L \oplus M \tag{27}$$
or
$$L = H \ominus M; \quad M = H \ominus L, \tag{28}$$
so that $H \ominus M$ is the subspace of elements of $H$ orthogonal to
subspace $M$.


123. Linear functionals. We have had the definition of linear
functional $l(x)$ in a $B$ space and hence in $H$. We shall assume that
it is defined throughout $H$. We recall that its norm, which we shall
denote by $n_l$, is defined by

$$n_l = \sup_{\|x\| = 1} |l(x)| \tag{29}$$
and
$$|l(x)| \le n_l\,\|x\|. \tag{30}$$

Let us give an example of a linear functional. Let $y$ be a fixed
element of $H$. We put

$$l(x) = (x, y). \tag{31}$$
It is distributive by virtue of (1), and bounded by virtue of (5):
$$|l(x)| \le \|y\| \cdot \|x\|. \tag{32}$$

Notice that the = sign holds in this inequality when $x = y$, i.e.
we cannot replace $\|y\|$ by a smaller factor, and $\|y\|$ is the norm
of functional (31). If $y$ is the zero element, then $(x, y) = 0$ for any $x$
(the annihilation functional).

As a matter of fact, (31) gives all the possible functionals in $H$,
i.e. the following important theorem holds:

THEOREM. Every linear functional $l(x)$ is uniquely expressible by (31),
where $y$ is a fixed element of $H$.

Since the functional is distributive, we have $l(0) = 0$, where $0$
is the zero element of $H$. Let $L$ be the set of all elements $x$ for which
$l(x) = 0$. Since $l(x)$ is distributive and continuous, $L$ is a subspace.




It may happen that $L$ is the complete space $H$, i.e. that $l(x) = 0$
for any element $x$. We can obviously write such a functional in the
form $l(x) = (x, 0)$. We now take the general case, when the subspace
$L$ is part of $H$. Let $z$ be a fixed element of $H$ not belonging to $L$.
We can write it as $z = u + v$, where $u \in L$ and $v \perp L$, and $v \ne 0$
[122]. Since $v$ does not belong to $L$, we have: $l(v) \ne 0$. Let $x$ be any
element of $H$. We form the element $w = x - [l(x)/l(v)]\,v$ and
consider $l(w)$:

$$l(w) = l(x) - \frac{l(x)}{l(v)}\,l(v) = l(x) - l(x) = 0.$$

We see from this that the element $w = x - [l(x)/l(v)]\,v$ belongs to $L$,
whilst $v \perp L$ as seen above. We can therefore write

$$\Bigl(x - \frac{l(x)}{l(v)}\,v,\ v\Bigr) = 0$$

or, on expanding the scalar product:
$$(x, v) - \frac{l(x)}{l(v)}\,\|v\|^2 = 0,$$
whence it follows that we can write $l(x)$ as the scalar product

$$l(x) = \Bigl(x,\ \frac{\overline{l(v)}}{\|v\|^2}\,v\Bigr) = (x, y), \quad \text{where} \quad y = \frac{\overline{l(v)}}{\|v\|^2}\,v.$$
It remains to show that the scalar product form of $l(x)$ is unique.
Let $l(x) = (x, y) = (x, y_1)$. Hence $(x, y - y_1) = 0$ for any $x$ of $H$.
On putting $x = y - y_1$ we get $\|y - y_1\| = 0$, i.e. $y_1 = y$, and the
proof is complete.
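
The theorem can be illustrated in $\mathbb{C}^3$, again with $(x, y) = \sum_k x_k \bar{y}_k$. The sketch below is an assumption-laden example rather than the argument of the proof: it recovers the representing element coordinate-wise as $y_k = \overline{l(e_k)}$ for the coordinate vectors $e_k$, and then checks that the formula $y = \overline{l(v)}\,v/\|v\|^2$ of the proof gives the same element for a $v$ orthogonal to the null subspace of $l$. The functional `l` and all helper names are hypothetical.

```python
def inner(x, y):
    # (x, y) = sum of x_k * conj(y_k), linear in the first argument.
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def representing_element(l, dim):
    # y_k = conj(l(e_k)), so that (x, y) = sum_k x_k l(e_k) = l(x).
    y = []
    for k in range(dim):
        e = [0j] * dim
        e[k] = 1 + 0j
        y.append(complex(l(e)).conjugate())
    return y

def l(x):
    # A sample distributive (linear) functional on C^3.
    return (2 - 1j) * x[0] + 3j * x[1] - x[2]

y = representing_element(l, 3)
x = [1 + 1j, -2 + 0j, 0.5j]
assert abs(l(x) - inner(x, y)) < 1e-12          # l(x) = (x, y), formula (31)

# Any non-zero multiple of y is orthogonal to the null subspace L of l;
# the formula of the proof, applied to such a v, returns y again.
v = [2j * yk for yk in y]
factor = complex(l(v)).conjugate() / inner(v, v).real
y_from_proof = [factor * vk for vk in v]
assert all(abs(p - q) < 1e-12 for p, q in zip(y_from_proof, y))

# The = sign in (32) holds for x = y: |l(y)| = ||y||^2.
assert abs(abs(l(y)) - inner(y, y).real) < 1e-12
```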
The functional defined above is sometimes called a linear functional
of the first kind. A linear functional of the second kind is then defined
as a bounded functional with the property:

$$l_1(c_1 x_1 + c_2 x_2 + \ldots + c_m x_m) = \bar{c}_1 l_1(x_1) + \bar{c}_2 l_1(x_2) + \ldots + \bar{c}_m l_1(x_m),$$
i.e. constant factors are transformed into the conjugate complex
numbers when taken outside the functional sign. An example of a
linear functional of the second kind is provided by the scalar product
in which the variable element $x$ is in the second place, and the fixed
element $y$ in the first:

$$l_1(x) = (y, x). \tag{$31_1$}$$

If $l_1(x)$ is a linear functional of the second kind, $l(x) = \overline{l_1(x)}$ is a
linear functional of the first kind. It follows at once from this remark




and the theorem that ($31_1$) gives the general form of a linear functional
of the second kind.

It also follows from the theorem that every linear functional $l(x)$
is completely defined by the element $y$ of $H$, i.e. the space $H^*$,
conjugate to $H$, is $H$. We recall further that, if $l(x)$ is a distributive
bounded functional on a lineal everywhere dense in $H$, it
can be uniquely extended on to the whole of $H$ in such a way that
it is linear (bounded) on $H$ with the same norm as on the lineal [97].


124. Linear operators. We now consider linear (bounded) operators,
defined throughout $H$, the values of which also belong to $H$. In future,
unless there is a special proviso, we shall use the terms "linear
functional" and "linear operator" for distributive bounded functionals
and operators, given throughout $H$ [97, 98]. We shall write $\|A\|$
or $n_A$ for the norm of the operator $A$. We recall the formula

$$\|A\| = \sup_{\|x\| \le 1} \|Ax\| = \sup_{\|x\| = 1} \|Ax\|. \tag{33}$$

We shall use $E$ to denote the operator of the identity transformation,
i.e. $Ex = x$ for any $x \in H$ ($\|E\| = 1$). If $\|A\| = 0$, $A$ is the
annihilation operator, i.e. $Ax = 0$ for any $x \in H$. Let $L$ be a subspace.
By the theorem of [122], given any $x \in H$, we have the unique form
$x = y + z$, where $y \in L$ and $z \perp L$. The operator transforming $x$
into $y$ is called the projection operator or projector into $L$, and is
denoted by

$$y = P_L x. \tag{34}$$

If $L$ is the whole of $H$, $P_L = E$. If $L$ consists of a single zero element,
$P_L$ is the annihilation operator. In the general case $\|P_L x\| \le \|x\|$,
where the = sign holds when and only when $x \in L$. If $P_L$ is not the
annihilation operator, $\|P_L\| = 1$. $P_L$ is distributive: for, if we have
two expansions: $x_1 = y_1 + z_1$ and $x_2 = y_2 + z_2$, where $y_1$ and $y_2 \in L$,
whilst $z_1$ and $z_2 \perp L$, then $x_1 + x_2 = (y_1 + y_2) + (z_1 + z_2)$, where
$y_1 + y_2 \in L$, and $z_1 + z_2 \perp L$, i.e.
$P_L(x_1 + x_2) = P_L x_1 + P_L x_2$. Similarly, $P_L(ax) = aP_L(x)$.

Let us now introduce some new concepts and discuss the elementary
properties of linear operators [cf. 97]. We shall often omit the word
"linear".

If $A$ and $B$ are two operators such that $Ax = Bx$ for any element $x$,
we say that $A$ and $B$ coincide and write $A = B$. If the distributive
and bounded operator $A$ is given on some lineal everywhere
dense in $H$, it can be uniquely extended to the whole of $H$ whilst
remaining distributive and bounded, just as in the case of a functional,
whilst its norm remains not greater than the original norm on the lineal.

If $A$ and $B$ are two operators, and $a$, $b$ two complex numbers,
$aA + bB$ is a linear operator, defined by

$$(aA + bB)x = aAx + bBx. \tag{35}$$
On taking into account that
$$\|aAx + bBx\| \le |a| \cdot \|Ax\| + |b| \cdot \|Bx\| \le (|a|\,n_A + |b|\,n_B)\,\|x\|,$$

we see that the norm of $aA + bB$ is $\le |a|\,n_A + |b|\,n_B$. Operators
can thus be multiplied by complex numbers and added. This operation
is subject to the usual laws of algebra. Successive application of
operators $A$ and $B$ is again a linear operator, which we write
symbolically as $BA$. Application of the same operators, but in the opposite
order, leads to a linear operator $AB$, which in general differs from $BA$.
We call $BA$ and $AB$ the products of operators $A$ and $B$. This definition
can be extended immediately to the case of any finite number of
factors. If $AB = BA$, the operators are said to commute. We have

$$\|BAx\| \le n_B\,\|Ax\| \le n_B n_A\,\|x\|,$$

so that the norms of $BA$ and $AB$ are $\le n_B n_A$. Notice also that,
if $a$ is a complex number and $A$ an operator, the norm of $aA$ is
precisely equal to $|a|\,n_A$. A product of operators is obviously subject
to the associative law, i.e.

$$C(BA) = (CB)A,$$
and to the distributive law:
$$(A + B)C = AC + BC \quad \text{and} \quad C(A + B) = CA + CB.$$


We now introduce conjugate (adjoint) operators. Let $A$ be a linear
operator; we consider the scalar product $(Ax, y)$. Given any fixed
element $y$, it is a functional of $x$. It is distributive, since the scalar
product is distributive, and it is bounded by virtue of the obvious
formula

$$|(Ax, y)| \le n_A\,\|y\| \cdot \|x\|.$$

But every functional can be uniquely expressed as a scalar product.
Thus, given any fixed element $y$, there exists a definite element $y^*$
such that

$$(Ax, y) = (x, y^*) \tag{36}$$

for any $x$ of $H$. This formula therefore gives a definite law by which
a definite element $y^*$ corresponds to each element $y$. We write this
law as $y^* = A^* y$, where $A^*$ is the symbol of some operator. It is
distributive because the scalar products $(Ax, y)$ and $(x, y^*)$ are
distributive with respect to the second argument. Further, $A^*$ is
bounded, as will be shown below. This operator $A^*$ is called the
conjugate to $A$. We can now write (36) as

$$(Ax, y) = (x, A^* y). \tag{37}$$


The following formulae for the conjugate to a sum and product
of operators are immediate consequences of the definition:

$$(aA)^* = \bar{a}A^*; \quad (A + B)^* = A^* + B^*; \quad (AB)^* = B^*A^*; \quad (A^*)^* = A. \tag{38}$$

Let us prove say the third formula. On twice applying definition
(37), we obtain
$$(ABx, y) = (Bx, A^* y) = (x, B^*A^* y),$$

whence it follows that $(AB)^* = B^*A^*$. Let us also prove the last of
formulae (38). Using definition (37) and the property
$(u, v) = \overline{(v, u)}$, we have

$$(A^* x, y) = \overline{(y, A^* x)} = \overline{(Ay, x)} = (x, Ay),$$

whence it follows that $(A^*)^* = A$. Finally, let us show that $A^*$ is
bounded.

THEOREM 1. The norm of the conjugate operator is equal to the norm
of the original operator, i.e. n_{A*} = n_A.

On putting x = A*y in (37) and using inequalities (5) and (33),
we obtain

||A*y||² = |(A(A*y), y)| ≤ ||A(A*y)|| ||y|| ≤ n_A ||A*y|| ||y||,

whence it follows that ||A*y|| ≤ n_A ||y||, so that n_{A*} ≤ n_A. Since
(A*)* = A, we have, by what has been proved: n_A ≤ n_{A*}, i.e.
n_{A*} = n_A.
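
As a numerical illustration, relation (37) and the product rule in (38) can be checked for 2 × 2 complex matrices. The sketch below assumes the scalar product (x, y) = Σ x_i ȳ_i and takes the conjugate transpose as A*; the helper names inner, apply, adjoint and matmul, and the sample matrices, are chosen here for the illustration only.

```python
def inner(x, y):
    # Scalar product (x, y): linear in x, conjugate-linear in y.
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(A, x):
    # Matrix-vector product Ax.
    return [sum(a * c for a, c in zip(row, x)) for row in A]

def adjoint(A):
    # Conjugate transpose, playing the role of A*.
    n, m = len(A), len(A[0])
    return [[A[i][j].conjugate() for i in range(n)] for j in range(m)]

def matmul(A, B):
    # Matrix product AB.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1 + 2j, 3j], [2, 1 - 1j]]
B = [[0, 1j], [1, 2 + 1j]]
x = [1 - 1j, 2j]
y = [3, 1 + 1j]

lhs = inner(apply(A, x), y)            # (Ax, y)
rhs = inner(x, apply(adjoint(A), y))   # (x, A*y)
product_rule_holds = adjoint(matmul(A, B)) == matmul(adjoint(B), adjoint(A))
```

Here lhs and rhs agree, and product_rule_holds records that (AB)* = B*A* for the sample matrices.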
An operator A is said to be self-conjugate if A* = A. It is therefore
characteristic of a self-conjugate operator that

(Ax, y) = (x, Ay). (39)

If we put y = x in this equation and note that (x, Ax) = \overline{(Ax, x)},
it will be seen that (Ax, x) is real for any element x in the case of a
self-conjugate operator. The converse is also true.

THEOREM 2. The necessary and sufficient condition for A to be
self-conjugate is that (Ax, x) be real for any element x.


We have proved the necessity. Now suppose that (Ax, x) is real
for any choice of x, and let us show that A is self-conjugate. We have
by hypothesis:

(A(x + y), x + y) = (x + y, A(x + y))
and (A(x + iy), x + iy) = (x + iy, A(x + iy)).

On expanding the scalar products and noticing that (Ax, x) =
= (x, Ax) and (Ay, y) = (y, Ay), we obtain

(Ay, x) + (Ax, y) = (y, Ax) + (x, Ay);
(Ay, x) − (Ax, y) = (y, Ax) − (x, Ay).

Term by term subtraction gives us (39), whence it follows that A
is self-conjugate. On recalling (38), we see that any linear combination
a_1 A_1 + a_2 A_2 + ... + a_m A_m of self-conjugate operators A_k with real
coefficients is a self-conjugate operator, whilst the product AB of
self-conjugate operators is self-conjugate when and only when A and
B commute.
Let L be a subspace and M its complement. We have by the
theorem of [122]:

E = P_L + P_M. (40)

It may easily be seen that every projector P_L is a self-conjugate
operator. For, by the orthogonality of L and M, and (40), we have

(P_L x, y) = (P_L x, P_L y + P_M y) = (P_L x, P_L y) =
= (P_L x + P_M x, P_L y) = (x, P_L y).


Let A be any linear operator. We form the two operators:

A_1 = (1/2)(A + A*); A_2 = (1/(2i))(A − A*). (41)

It may be seen, in view of (38), that A_1 and A_2 are self-conjugate.
We thus have the following expression for any linear operator in
terms of self-conjugate operators: A = A_1 + iA_2.
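
The decomposition (41) is easily checked for a matrix. The following sketch assumes that A* is the conjugate transpose; the helpers adjoint, combine and max_diff and the sample matrix are introduced only for this illustration.

```python
def adjoint(A):
    # Conjugate transpose, playing the role of A*.
    n, m = len(A), len(A[0])
    return [[A[i][j].conjugate() for i in range(n)] for j in range(m)]

def combine(a, A, b, B):
    # Entrywise linear combination a*A + b*B.
    return [[a * p + b * q for p, q in zip(ra, rb)] for ra, rb in zip(A, B)]

def max_diff(A, B):
    # Largest entrywise distance between two matrices.
    return max(abs(p - q) for ra, rb in zip(A, B) for p, q in zip(ra, rb))

A = [[1 + 2j, 3j], [2, 1 - 1j]]
A_star = adjoint(A)

A1 = combine(0.5, A, 0.5, A_star)      # (A + A*)/2
A2 = combine(-0.5j, A, 0.5j, A_star)   # (A - A*)/(2i), since 1/(2i) = -i/2
recombined = combine(1, A1, 1j, A2)    # A1 + i*A2
```

Both A1 and A2 coincide with their own conjugate transposes, and recombined reproduces A.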


125. Bilinear and quadratic functionals. We now show that it is 
possible to define any linear operator with the aid of a special type 
of functional. A bilinear functional is a definite law by which any 
pair of elements x and y of H is associated with a definite complex 
number l(x, y), where l(x, y) is distributive with respect to the first
argument as a functional of the first kind and with respect to the
second argument as a functional of the second kind:


l(ax_1 + bx_2, y) = a l(x_1, y) + b l(x_2, y);
l(x, ay_1 + by_2) = ā l(x, y_1) + b̄ l(x, y_2). (42)


We assume in addition that a bilinear functional is bounded, 
i.e. we assume that there exists a positive number N such that, 
given any elements x and y of H, we have 


|l(x, y)| ≤ N ||x|| ||y||. (43)


The least value of N in this inequality, the norm n_l of the bilinear
functional, is defined by


n_l = sup |l(x, y)|   (||x|| = 1, ||y|| = 1). (44)

If A is any linear operator, the formula

l(x, y) = (Ax, y) (45)

may easily be seen to yield a bilinear functional. Here,

|l(x, y)| ≤ n_A ||x|| ||y||,

so that n_l ≤ n_A for the bilinear functional (45).

We now show that (45) gives all possible bilinear functionals. 

THEOREM. Every bilinear functional is uniquely expressible by (45),
where A is a linear operator, and the norm of the bilinear functional n_l
is equal to the norm of the operator n_A.

If we fix x, l(x, y) is a functional of the second kind in y, and we
can write [123]: l(x, y) = (z, y), where z is uniquely defined if we
fix x, i.e. z = Ax, where A is an operator defined throughout H.
Its distributiveness is a direct consequence of (42) and the
distributiveness of (z, y) with respect to z. Let us show that A is bounded.
Using (43) with N = n_l, we can write

|(Ax, y)| ≤ n_l ||x|| ||y||.

On putting y = Ax and dividing both sides of the inequality
obtained by ||Ax||, we get ||Ax|| ≤ n_l ||x|| (if ||Ax|| = 0, the
last inequality is obvious). Hence it follows that A is bounded and
that n_A ≤ n_l. But we know from above that n_l ≤ n_A, so that n_l = n_A.

It remains to prove that form (45) is unique. Let

l(x, y) = (Ax, y) = (A_1 x, y).


It follows from this that (Ax − A_1 x, y) = 0 for any x and y.
On putting y = Ax − A_1 x in this equation, we get ||Ax − A_1 x|| = 0,
i.e. we have Ax = A_1 x for any x, i.e. operators A and A_1 coincide,
which completes the proof. It follows from this theorem that specifying
a linear operator is equivalent to specifying a bilinear functional.
Similarly, in algebra, specifying the elements a_ik of a matrix is
equivalent to specifying the bilinear form

Σ_{i,k=1}^{n} a_ik x_k ȳ_i.

Every bilinear functional l(x, y) generates a corresponding quadratic
functional (quadratic form), if we put y = x in it:

l(x, x) = (Ax, x).


A bilinear functional is readily expressed in terms of the quadratic 
form generated by it, i.e. it can easily be shown that 


(Ax, y) = [(Ax_1, x_1) − (Ax_2, x_2)] + i [(Ax_3, x_3) − (Ax_4, x_4)], (46)
where

x_1 = (1/2)(x + y); x_2 = (1/2)(x − y);
x_3 = (1/2)(x + iy); x_4 = (1/2)(x − iy). (47)

Four quadratic functionals appear on the right-hand side of (46).
The fact that the quadratic functional (Ax, x) is real for any element x
is a characteristic feature of a self-conjugate operator, as we have 
seen. 
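
Formula (46) can be verified numerically. The sketch below assumes the scalar product (x, y) = Σ x_i ȳ_i, which is linear in the first argument and conjugate-linear in the second, as in (42); the names inner, apply, quad and polarized and the sample data are illustrative.

```python
def inner(x, y):
    # Scalar product (x, y): linear in x, conjugate-linear in y.
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(A, x):
    # Matrix-vector product Ax.
    return [sum(a * c for a, c in zip(row, x)) for row in A]

def quad(A, x):
    # Quadratic functional (Ax, x).
    return inner(apply(A, x), x)

def polarized(A, x, y):
    # Right-hand side of (46), built from the four elements (47).
    x1 = [(a + b) / 2 for a, b in zip(x, y)]
    x2 = [(a - b) / 2 for a, b in zip(x, y)]
    x3 = [(a + 1j * b) / 2 for a, b in zip(x, y)]
    x4 = [(a - 1j * b) / 2 for a, b in zip(x, y)]
    return (quad(A, x1) - quad(A, x2)) + 1j * (quad(A, x3) - quad(A, x4))

A = [[1 + 2j, 3j], [2, 1 - 1j]]   # not self-conjugate: (46) holds for any A
x = [1 - 1j, 2j]
y = [3, 1 + 1j]

bilinear = inner(apply(A, x), y)  # (Ax, y)
from_quadratic = polarized(A, x, y)
```

The two values bilinear and from_quadratic agree up to rounding, although only quadratic functionals enter the second.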

Suppose that A has the property that (Ax, x) = 0 for any element
x. It now follows from (46) that (Ax, y) = 0 for any x and y. But the
bilinear functional (Ax, y) obviously has this property if A is the
annihilation operator, and we can say in view of the uniqueness
indicated in the theorem that, if (Ax, x) = 0 for any x, A is the
annihilation operator. It follows at once from this that, if A and B are
such that (Ax, x) = (Bx, x) for any x, then A = B.


126. Bounds of a self-conjugate operator. Let A be a self-conjugate
operator. Using (5) and (33), we can write |(Ax, x)| ≤ n_A ||x||²,
or, if we take ||x|| = 1, we get |(Ax, x)| ≤ n_A. Hence, if we take
all possible normalized elements x, i.e. such that ||x|| = 1, the set of
real numbers (Ax, x) is bounded from below and above. Let m denote
the strict lower bound and M the strict upper bound of this set:


m = inf (Ax, x); M = sup (Ax, x)   (||x|| = 1). (48)

The numbers m and M are usually termed the bounds of the
self-conjugate operator. We can write, by definition of the strict bounds:


m ≤ (Ax, x) ≤ M for ||x|| = 1. (49)


Constant factors can be taken outside the scalar product, so that
we can write for an element x with any norm:


m ||x||² ≤ (Ax, x) ≤ M ||x||². (50)


As a matter of fact, the norm n_A of the operator is very simply
expressed in terms of its bounds m and M, in accordance with the
following theorem:

THEOREM 1. The norm n_A is equal to the greater of the two numbers
|m| and |M|.

The proof is a word-for-word repetition of the proof of Theorem 3
of [IV; 36, 39], in which it is shown that

n_A = sup |(Ax, x)|   (||x|| = 1),

which is the same as the statement of the theorem. Notice also that
Theorem 2 of [IV; 36] is the same as the assertion that n_l = n_A,
which was proved in the previous section.
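
Theorem 1 can be illustrated on a real symmetric 2 × 2 matrix, for which real unit vectors x(t) = (cos t, sin t) are enough to attain the bounds. The sketch below estimates m, M and n_A by sampling; the matrix, the grid size N and the helper names are assumptions made for the example, and the sampled values are approximations only.

```python
import math

A = [[2.0, 1.0], [1.0, -3.0]]   # real symmetric, hence self-conjugate

def image(t):
    # Unit vector x(t) = (cos t, sin t) together with Ax(t).
    x = (math.cos(t), math.sin(t))
    Ax = (A[0][0] * x[0] + A[0][1] * x[1],
          A[1][0] * x[0] + A[1][1] * x[1])
    return x, Ax

def quad(t):
    # Quadratic form (Ax, x) on the unit vector x(t).
    x, Ax = image(t)
    return Ax[0] * x[0] + Ax[1] * x[1]

def norm_image(t):
    # ||Ax(t)||
    _, Ax = image(t)
    return math.hypot(Ax[0], Ax[1])

N = 20000
angles = [math.pi * k / N for k in range(N)]   # x and -x give the same values

m = min(quad(t) for t in angles)               # approximate lower bound
M = max(quad(t) for t in angles)               # approximate upper bound
n_A = max(norm_image(t) for t in angles)       # approximate operator norm
```

For this matrix the exact bounds are (−1 ± √29)/2, and the sampled n_A agrees with the greater of |m| and |M| to within the sampling error.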

Let us introduce some new concepts. 

DEFINITION. A self-conjugate operator A is said to be positive
(negative) if the corresponding quadratic functional (Ax, x) ≥ 0 (≤ 0). It
is characteristic of a positive operator that m ≥ 0, i.e. that its lower
bound is non-negative. We can also talk of a self-conjugate operator A
being greater than a self-conjugate operator B, and write A > B,
if A and B do not coincide and the difference A − B is a positive
operator. A negative operator is similarly defined. It may be recalled
that, in the case of n-dimensional space, a self-conjugate matrix is
called positive if the corresponding Hermitian form

Σ_{i,k=1}^{n} a_ik x̄_i x_k   (a_ki = ā_ik)

takes only non-negative values. To say that a matrix is positive is
equivalent to saying that it has no negative eigenvalues. If (Ax, x)
changes sign for different x, the self-conjugate operator A can
evidently not be called either positive or negative. In the case of
finite-dimensional space, these A are the self-conjugate matrices
whose eigenvalues have different signs.

THEOREM 2. If A is a self-conjugate operator, A² is a positive operator.
If A is any linear operator, the operators AA* and A*A are self-conjugate
and positive.

The first assertion follows at once from

(A²x, x) = (Ax, Ax) = ||Ax||² ≥ 0,
and the second from
(AA*x, x) = (A*x, A*x) = ||A*x||² ≥ 0, (51)
(A*Ax, x) = (Ax, Ax) = ||Ax||² ≥ 0. (52)


127. The inverse operator. An important concept in the theory of
operators is that of the inverse operator (cf. the concept of inverse
matrix in [III₁]). Various definitions can be given of the inverse operator,
and will be described in the present section.

As in the previous section, we shall here describe a distributive,
bounded operator, given throughout H, as linear.

DEFINITION. A linear operator A is said to have a bounded inverse B
if B is a bounded operator defined throughout H, and


AB = BA = E, (53)


where E is the operator of the identity transformation. The fact
that B is bounded is described by the usual inequality ||Bx|| ≤
≤ N ||x||. It may easily be seen that there can only be one bounded
inverse operator. For, if we have AC = E, on multiplying by B on
the left, using (53) and recalling that BE = B and EC = C, we get
B = C. The operator B defined above is usually written as A⁻¹,
and we have

AA⁻¹ = A⁻¹A = E. (54)

Suppose we have

y = Ax (x ∈ H). (55)

Since A⁻¹ is defined throughout H, we can apply A⁻¹ to both
sides and obtain

x = A⁻¹y. (56)

It is evident from this that, if A has a bounded inverse A⁻¹, A
performs a one-to-one mapping of space H into itself, i.e. a
definite element y corresponds, in accordance with (55), to any element
x ∈ H, and conversely, a definite element x, given by (56), corresponds
to any element y ∈ H. Similarly, A⁻¹ maps H one-to-one into
itself. Since A is distributive, A⁻¹ must be distributive, i.e. A⁻¹ is a
linear operator. It follows at once from (54) that


(A⁻¹)⁻¹ = A. (57)


A more general definition of inverse operator can be given. We
notice first of all that, since a linear operator is distributive, the set
of elements y given by (55) is a lineal, which we denote by R(A).
Let us now consider the property that A must have in order for the
correspondence between elements x of H and elements y of R(A) to
be one-to-one. By (55), a definite element y of R(A) corresponds to
any x of H. We have to show that, conversely, a definite element
x of H corresponds to any element y of R(A). Let x_1 and x_2 be two
distinct elements of H, and y_1, y_2 the corresponding elements of R(A):

y_1 = Ax_1; y_2 = Ax_2.
Subtraction gives
y_2 − y_1 = A(x_2 − x_1).

If we had y_1 = y_2, i.e. the same element of R(A) corresponded to
distinct elements x_1 and x_2 of H, we should have A(x_2 − x_1) = 0,
i.e. the equation

Ax = 0 (58)

must have a solution different from the zero element. Conversely,
if (58) has a non-zero solution x_0, the same element y = 0 corresponds
to distinct elements x = x_0 and x = 0. Thus the necessary and
sufficient condition for (55) to give a one-to-one correspondence
between elements x of H and elements y of R(A) is that (58) have
only a zero solution. An inverse B to operator A is now defined on
the lineal R(A). It transforms an element y of R(A) into an element
x of H such that y is expressed in terms of x by (55). We shall call
this operator simply the inverse, as distinct from the bounded inverse
which we defined above. The operator B is defined only on the lineal
R(A), which may in fact not coincide with H, and we can by no
means assert that B is bounded. But, since A is distributive, we
can say that B is a distributive operator on the lineal R(A). On using
the previous notation A⁻¹ for B, we can write A⁻¹(Ax) = x if x ∈ H,
and A(A⁻¹x) = x if x ∈ R(A).


We shall prove later that, if equation (58) has only a zero solution
and R(A) is the whole of H, the operator B = A⁻¹ is bounded, i.e.
A has a bounded inverse [cf. 97].

Let operator A have a bounded inverse, and let us pass to the
conjugate operators in equations (53):


(A⁻¹)* A* = A* (A⁻¹)* = E* = E.


Hence it follows that
(A⁻¹)* = (A*)⁻¹. (59)


Formulae (53) require that the bounded operator B be inverse 
to A from the left and the right, and we have simply called it the 
bounded inverse operator in this case. 

Let us now consider bounded inverse operators only from the left 
or only from the right. 

We say that A has a bounded inverse from the left, or simply 
a left-hand inverse, if there exists a linear operator B such that 
BA = E. Similarly, if AC = E, C is called a bounded right-hand 
inverse. 

THEOREM 1. If A has a left-hand inverse B and a right-hand inverse C,
there can be only one left-hand and only one right-hand inverse, these
being coincident, i.e. there exists the bounded inverse A⁻¹.

By hypothesis, BA = E and AC = E, whence it follows that
(BA)C = C and B(AC) = B. The left-hand sides of these equations
coincide, so that B = C, i.e. every left-hand inverse coincides with
every right-hand inverse, so that there can only be one left-hand
inverse and only one right-hand inverse.

THEOREM 2. If a unique left-hand inverse exists, a right-hand inverse
also exists. If a unique right-hand inverse exists, a left-hand inverse
also exists. In both cases both inverses are unique and coincide (by
Theorem 1).

Let us prove the first statement. Let A have a unique left-hand
inverse B, i.e. BA = E. On multiplying from the left by A, we
get ABA = A or (AB − E)A = 0, where the zero on the right-hand
side denotes the annihilation operator. On adding to both sides
BA = E, we can write (AB − E + B)A = E. But B is the unique
left-hand inverse by hypothesis, so that AB − E + B = B, whence
AB = E, i.e. B is also a right-hand inverse.

A further remark: if A has two different left-hand (right-hand)
inverse operators B and C, it has an infinite set of left-hand inverse
operators. For, if BA = E and CA = E, it is easily seen that the
operator B + a(C − B) is a left-hand inverse for any choice of the
number a:


(B + aC − aB)A = BA + aCA − aBA = E + aE − aE = E.
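
A standard example of an operator with many left-hand inverses and no right-hand inverse is the shift (x_0, x_1, ...) → (0, x_0, x_1, ...) in the space of square-summable sequences. The sketch below is an illustration that is not taken from the text: sequences with finitely many non-zero terms are modelled by Python lists, and the names right_shift, left_shift and left_inverse are introduced here. The operator left_inverse(a) is of the form B + a(C − B), with B = left_inverse(0) and C = left_inverse(1).

```python
def right_shift(x):
    # A: (x0, x1, ...) -> (0, x0, x1, ...)
    return [0] + list(x)

def left_shift(x):
    # B: (x0, x1, ...) -> (x1, x2, ...)
    return list(x[1:])

def left_inverse(a):
    # C_a: (x0, x1, x2, ...) -> (x1 + a*x0, x2, ...); C_0 is the left shift.
    def C(x):
        x0 = x[0] if x else 0
        y = list(x[1:]) or [0]
        y[0] = y[0] + a * x0
        return y
    return C

x = [1, 2, 3]

back = left_shift(right_shift(x))          # BA x = x
other = left_inverse(5)(right_shift(x))    # C_5 A x = x as well
not_identity = right_shift(left_shift(x))  # AB x loses the first coordinate
```

Every left_inverse(a) undoes right_shift, whereas right_shift(left_shift(x)) replaces the first coordinate by zero, so the left shift is not a right-hand inverse of the right shift.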


It follows from the above results that the following four cases 
are conceivable: 

(I) there exists a unique left- and right-hand inverse; 

(II) no inverses exist, either left- or right-hand; 

(III) there exists an infinite set of left-hand inverses and none 
right-hand; 

(IV) there exists an infinite set of right-hand inverses and none 
left-hand. 

We shall see later that all these cases may be realized. A simple
theoretical criterion will be given now, with the aid of which these
cases can be distinguished. We consider the self-conjugate positive
(non-negative) operators A*A and AA*. The lower bounds of these
operators, denoted by m(A*A) and m(AA*), are greater than
or equal to zero [126]. Suppose that there exists at least one
left-hand inverse: BA = E, and let k > 0 be the norm of B. We have
||BAx|| = ||x||, and on the other hand, ||BAx|| ≤ k ||Ax||,
whence it follows that k ||Ax|| ≥ ||x|| and ||Ax|| ≥ (1/k) ||x||.
Now (A*Ax, x) = ||Ax||² ≥ (1/k²) ||x||², so that m(A*A) ≥ 1/k², i.e.
m(A*A) > 0. We now show that, conversely, if m(A*A) > 0, A has a bounded
left-hand inverse. We shall prove below that, if the lower bound
of a self-conjugate operator F is positive, F must have a bounded
inverse [129]. On applying this to F = A*A, we see that there
exists a bounded operator D such that DA*A = E, i.e. (DA*)A = E,
whence it follows that DA* is a bounded left-hand inverse for A.
Similarly, the necessary and sufficient condition for the existence of
at least one bounded right-hand inverse is that m(AA*) > 0.

It follows at once from these arguments that the necessary and 
sufficient conditions for the realization of the above four cases are: 


I. m(A*A)>0 and m(AA*)>0; 
II. m(A*A)=0 and m(AA*)=0; 
III. m(A*A)>0 and m(AA*)=0; 
IV. m(A*A)=0 and m(AA*)>0. 


Notice that, if A commutes with A* (say A = A*, i.e. A is
self-conjugate), cases III and IV cannot hold. On using (51) and (52),
we can state the result in the first case as follows: the necessary
and sufficient condition for the existence of a left- and right-hand
inverse operator is that there exist a positive number l such
that ||Ax|| ≥ l ||x|| and ||A*x|| ≥ l ||x|| for any element x.
We have not made use of the distributiveness of B and C in any of
the above. The only important point is that they be defined and
bounded in the whole of H. In case I, the unique left- and right-hand
inverse is necessarily a linear operator, as we have already
seen. In case III there exists a linear operator B = DA*, inverse
from the left, and similarly in case IV. We shall subsequently be
concerned with an inverse distributive operator A⁻¹, defined on
R(A). Notice that, if BA = E, then A*B* = E, i.e. if case III applies
for A, case IV applies for A*.

Inverse operators play a fundamental role when solving the equation
Ax = y, where y is the given and x the required element. If there
exists a left-hand inverse B, multiplication of both sides by B
gives us the equation x = By, i.e. when the left-hand inverse exists,
the solution, if there is one, is expressible as x = By, and is therefore
unique. If a right-hand inverse C exists, the equation Ax = y is
obviously satisfied by substituting x = Cy, i.e. the existence of the
right-hand inverse guarantees the existence of the solution x = Cy.


128. Spectrum of an operator. There are two fundamental problems
in the application of the theory of operators to mathematical analysis:
solution of the homogeneous equation


Ax = λx (60), i.e. (A − λE)x = 0, (60₁)

and of the non-homogeneous equation


Ax = λx + y (61), i.e. (A − λE)x = y, (61₁)


where x is the required and y the given element, and λ is a numerical
parameter. We call λ an eigenvalue of the operator A if (60) has
solutions differing from the zero element; these solutions are called
the eigenelements of operator A, corresponding to the eigenvalue.
If λ is an eigenvalue of A, and we associate the zero element with
the corresponding eigenelements (the zero element satisfies (60) for
any λ), we can say, since equation (60) is linear and homogeneous,
and operator A is continuous, that our set of eigenelements (including
the zero) forms a subspace. We shall call it the subspace of
eigenelements corresponding to the eigenvalue.

If this subspace has a finite number of dimensions r, i.e. if the
maximum number of linearly independent elements belonging to the
subspace is equal to the finite number r, we say that the corresponding
eigenvalue λ has rank or multiplicity r. If the subspace of
eigenelements is infinite-dimensional, we say that the rank of the
eigenvalue is equal to infinity. The following theorem holds in the case
of a self-conjugate operator A:

THEOREM 1. The eigenvalues of a self-conjugate operator are real
and eigenelements corresponding to different eigenvalues are mutually
orthogonal.

Let λ be an eigenvalue of the self-conjugate operator A and x a
corresponding eigenelement (non-zero). On multiplying both sides of
(60₁) by x from the right, we get

(Ax, x) = λ(x, x).

Since A is self-conjugate, the left-hand side is real, so that λ is
real. Let λ′ and λ″ be two distinct eigenvalues, and x′, x″ corresponding
eigenelements:


Ax′ = λ′x′; Ax″ = λ″x″.


On forming the right-hand scalar product of the first equation
with x″, and the left-hand of the second with x′ and subtracting term
by term, we obtain


(Ax′, x″) − (x′, Ax″) = (λ′ − λ″)(x′, x″).


The left-hand side vanishes, since A is self-conjugate, and λ′ − λ″ ≠ 0.
Hence (x′, x″) = 0, and the theorem is proved.
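
Theorem 1 can be observed on a 2 × 2 self-conjugate matrix, where the eigenvalues are given by an explicit formula. The following sketch assumes a matrix [[a, b], [b̄, d]] with real a, d and b ≠ 0; the function name hermitian_eig_2x2 and the sample entries are chosen only for this illustration.

```python
import math

def inner(x, y):
    # Scalar product (x, y): linear in x, conjugate-linear in y.
    return sum(p * q.conjugate() for p, q in zip(x, y))

def hermitian_eig_2x2(a, b, d):
    # Eigenpairs of [[a, b], [conj(b), d]] with real a, d and b != 0.
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    pairs = []
    for lam in (mean + radius, mean - radius):
        # First row of (A - lam*E)v = 0: (a - lam)*v0 + b*v1 = 0.
        pairs.append((lam, [b, lam - a]))
    return pairs

(lam1, v1), (lam2, v2) = hermitian_eig_2x2(2.0, 1 - 1j, 3.0)
overlap = inner(v1, v2)   # scalar product of the two eigenelements
```

For these entries the eigenvalues are 4 and 1, both real, and overlap vanishes up to rounding.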

Solving the non-homogeneous equation (61₁) amounts to finding
the operator (A − λE)⁻¹, inverse to A − λE. If λ is an eigenvalue
of A, the homogeneous equation (60₁) has solutions differing from
the zero element, and by what was said in [127], the inverse
(A − λE)⁻¹ certainly does not exist. If λ is not an eigenvalue of A,
the inverse (A − λE)⁻¹ does exist, but it may be either a bounded
inverse or simply an inverse. Notice that the parameter λ may be
any complex number. We introduce the following definition.


DEFINITION. The value or point λ (in the plane of the complex variable)
is called a regular point of the operator A if A − λE has a bounded
inverse:

R_λ = (A − λE)⁻¹, (62)


and this linear operator R_λ, defined for all regular points λ, is called
the resolvent of A. The spectrum of an operator A is the set of the points
λ which are not regular points of A.

By what has been said, every eigenvalue of an operator A belongs
to its spectrum. We shall see later that λ that are not eigenvalues
may also belong to the spectrum.

Given any element y, if λ is a regular point, the non-homogeneous
equation (61₁) has a unique solution defined by


x = (A − λE)⁻¹y. (63)


If λ is not a regular point and is not an eigenvalue of A, (61₁) also
has a unique solution, if y belongs to the lineal R(A − λE). This
lineal consists of the elements y given by


y = (A − λE)x (x ∈ H), (64)


when x runs over the whole of H.

Thus the inverse operator (A − λE)⁻¹ is defined on the lineal
R(A − λE), if λ is not an eigenvalue. If λ is also not a regular point
of A, the operator (A − λE)⁻¹ is called the resolvent of A.

THEOREM 2. The elements of R(A − λE) are orthogonal to all the
solutions of the equation (A* − λ̄E)z = 0.

This assertion follows at once from the obvious equation


((A − λE)x, z) = (x, (A* − λ̄E)z).


Notice that, if A is a self-conjugate operator and λ an eigenvalue
(which is real), it follows from Theorem 2 that the elements of R(A − λE)
are orthogonal to the eigenelements of A corresponding to the
eigenvalue λ. We shall prove in the next section several theorems
characterizing the spectrum of a self-conjugate operator.

A further point must be mentioned in regard to the eigenelements
of a self-conjugate operator A. As we have seen, the eigenelements
corresponding to an eigenvalue λ = λ′ form a subspace. A complete
orthonormal system can be introduced into this subspace. If the
eigenvalue λ = λ′ has rank r, this orthonormal system will contain
r elements. We know that the eigenelements corresponding to different
eigenvalues are mutually orthogonal. Hence, if we introduce an
orthonormal system as indicated above into each of the subspaces
corresponding to a fixed eigenvalue, we get an orthonormal system
K in H. We shall say that the self-conjugate operator generates the
orthonormal system K in H. This system is defined up to the choice
of the complete orthonormal system in each of the subspaces
in question. It may happen that A has no eigenvalues at all. In this
case there will be no system K. We know that an orthonormal system
may contain only a finite or denumerable set of elements. It follows
at once from this that, if A has an infinite set of distinct eigenvalues,
this set is denumerable.

The orthonormal system K may be complete or incomplete in H.
Its property of being complete or incomplete is easily seen to be
independent of the choice of complete orthonormal system in the
subspaces of eigenelements corresponding to the fixed eigenvalue.
If K is a complete system, operator A is said to have a purely point
spectrum.


129. The spectrum of a self-conjugate operator. We shall discuss
self-conjugate operators in this section.

THEOREM 1. If λ is not an eigenvalue of the self-conjugate operator A,
(64) defines a lineal R(A − λE), complete in H.

Let us suppose the contrary, i.e. that R(A − λE) is not dense
in H, i.e. that the closure of R(A − λE) leads to a subspace different
from H. Now, by the theorem of [122], there exists a non-zero element
x_0, orthogonal to this subspace and hence to R(A − λE), i.e.
((A − λE)x, x_0) = 0 for any x of H, or, since A is self-conjugate:
(x, (A − λ̄E)x_0) = 0. On putting x = (A − λ̄E)x_0, we get
||(A − λ̄E)x_0|| = 0, i.e. Ax_0 = λ̄x_0. If λ is real, i.e. λ̄ = λ, λ must be
an eigenvalue of A, which contradicts the hypothesis. If λ is not
real, the equation Ax_0 = λ̄x_0 shows that the non-real number λ̄ is
an eigenvalue of the self-conjugate operator A, which is impossible,
so that the theorem is proved.

If λ is a regular point, R(A − λE) coincides with H. This follows
from the definition of regular point. If λ is an eigenvalue, all the
elements of R(A − λE) are orthogonal to the corresponding
eigenelements of A, and the lineal R(A − λE) cannot be dense in H.
We shall see later that, if λ is not a regular point and is not an
eigenvalue, the lineal R(A − λE) is not H (it is dense in H).


We now establish the necessary and sufficient condition for λ to be
regular.

THEOREM 2. The necessary and sufficient condition for λ to be a
regular point of a self-conjugate operator A is that there exist a positive
number p such that, for any x of H,


||(A − λE)x|| ≥ p ||x||. (65)


Every non-real λ, and every real λ lying outside the interval [m, M],
where m and M are the bounds of A, is regular.

Let us prove that (65) is necessary. Let λ be a regular point. There
now exists a bounded inverse operator (62). Writing q for its norm,
we have


||(A − λE)⁻¹y|| ≤ q ||y||,


or, on putting y = (A − λE)x in this inequality, we arrive at (65)
with p = 1/q. As regards the sufficiency of (65), it follows first of
all from this condition that λ is not an eigenvalue, so that the lineal
R(A − λE), defined in Theorem 1, is dense in H. We shall show
that it is closed, and therefore coincides with H. Suppose that the
elements y_n = (A − λE)x_n belong to R(A − λE) and y_n ⇒ y.
We have to show that y ∈ R(A − λE). By (65), we have
||y_n − y_m|| ≥ p ||x_n − x_m||. The sequence y_n is mutually convergent,
and we can say, in view of the last inequality, that the sequence x_n is
also mutually convergent, i.e. there exists an element x such that x_n ⇒ x.
It follows from y_n = (A − λE)x_n that y_n ⇒ y = (A − λE)x, so
that y ∈ R(A − λE). The lineal R(A − λE) therefore coincides with
H and the inverse (A − λE)⁻¹ of (A − λE) is defined throughout H.
To show that λ is a regular point, it remains to prove that (A − λE)⁻¹
is bounded. On putting x = (A − λE)⁻¹y in (65), we get


||(A − λE)⁻¹y|| ≤ (1/p) ||y||,


whence it follows that (A − λE)⁻¹ is bounded, and the first part
of the theorem is proved. Let λ = σ + τi, where τ ≠ 0. Putting
(A − λE)x = y, we can write


((A − λE)x, x) = (y, x) and ((A − λ̄E)x, x) = (x, (A − λE)x) = (x, y).
On subtracting the second from the first equation, we get
(λ̄ − λ)(x, x) = (y, x) − (x, y)
or
2|τ| ||x||² ≤ |(y, x)| + |(x, y)|,
and, by using inequality (5), we arrive at
2|τ| ||x||² ≤ 2 ||x|| ||y||,
i.e.
||(A − λE)x|| ≥ |τ| ||x||   (λ = σ + τi).

We have arrived at (65) with p = |τ|, and every non-real value
of λ is therefore regular. Now let λ be real, but lie outside the interval
[m, M]. Suppose say λ > M; we show that (65) is now satisfied.
We can write

((A − λE)x, x) = (Ax, x) − λ ||x||²,
or
((A − λE)x, x) = [(Ax, x) − M ||x||²] − (λ − M) ||x||².


It follows from the definition of the upper bound M of operator
A that the difference in square brackets is non-positive. In addition,
λ > M by hypothesis, and the last formula gives


|((A − λE)x, x)| ≥ (λ − M) ||x||².
On the other hand, we have
|((A − λE)x, x)| ≤ ||(A − λE)x|| · ||x||.
The last two inequalities together yield
||(A − λE)x|| ≥ (λ − M) ||x||,


whence (65) follows when λ > M, which is what we set out to prove.
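
The two lower estimates just obtained can be tested on a self-conjugate 2 × 2 matrix whose bounds are m = 1 and M = 4. In the sketch below the matrix, the sample values of λ, the number of trials and the helper names are assumptions made for the illustration; the vectors x are drawn from a pseudo-random generator with a fixed seed.

```python
import math
import random

A = [[2.0, 1 - 1j], [1 + 1j, 3.0]]   # self-conjugate, eigenvalues 1 and 4

def norm(x):
    # ||x|| for a complex vector.
    return math.sqrt(sum(abs(c) ** 2 for c in x))

def shifted_apply(A, lam, x):
    # (A - lam*E) x
    return [sum(A[i][k] * x[k] for k in range(len(x))) - lam * x[i]
            for i in range(len(x))]

def min_ratio(A, lam, trials=1000, seed=0):
    # Smallest sampled value of ||(A - lam*E)x|| / ||x||.
    rng = random.Random(seed)
    best = math.inf
    for _ in range(trials):
        x = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in A]
        best = min(best, norm(shifted_apply(A, lam, x)) / norm(x))
    return best

ratio_nonreal = min_ratio(A, 2.5 + 0.7j)   # should not fall below |tau| = 0.7
ratio_outside = min_ratio(A, 5.0)          # should not fall below lam - M = 1
```

The sampled ratios stay above |τ| = 0.7 and λ − M = 1 respectively, in agreement with (65).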

This theorem has several corollaries. 

COROLLARY 1. The necessary and sufficient condition for λ to belong
to the spectrum is that there exist a sequence of normalized elements
x_n such that

||(A − λE)x_n|| → 0.


For, if there is such a sequence, condition (65) cannot be fulfilled
with p > 0, so that λ must belong to the spectrum. Conversely,
if λ belongs to the spectrum, (65) is not fulfilled for any p > 0, i.e.
there exists a sequence of normalized elements x_n such that
||(A − λE)x_n|| → 0. Notice that, if λ is an eigenvalue, we can take the
same element as x_n, whatever the n, viz. any normalized eigenelement
x_0. Now (A − λE)x_n = 0 for any n.
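
A simple example, not taken from the text, shows how a point can satisfy the condition of Corollary 1 without being an eigenvalue. Consider the operator in the space of square-summable sequences which divides the n-th coordinate by n (n = 1, 2, ...). It is self-conjugate and Ax = 0 only for x = 0, yet the unit coordinate elements e_n give ||Ae_n|| = 1/n → 0, so that λ = 0 belongs to the spectrum. The sketch models sequences with finitely many non-zero terms by lists; all names are illustrative.

```python
import math

def apply_diag(x):
    # (Ax)_n = x_n / n for n = 1, 2, ...; list index k corresponds to n = k + 1.
    return [c / (k + 1) for k, c in enumerate(x)]

def unit_element(n):
    # e_n: 1 in the n-th place (n >= 1), zeros elsewhere.
    return [0.0] * (n - 1) + [1.0]

def norm(x):
    return math.sqrt(sum(abs(c) ** 2 for c in x))

orders = (1, 10, 100, 1000)
residuals = [norm(apply_diag(unit_element(n))) for n in orders]   # ||A e_n||
```

The residuals decrease like 1/n although each e_n is normalized.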

COROLLARY 2. If the lower bound m(A) > 0, λ = 0 lies outside
[m, M], and A has a bounded inverse. We made use of this in [127].

COROLLARY 3. The set of regular points of the real λ axis is open.

Let λ be a regular point. We have to show that, given any
sufficiently small ε > 0, all the points λ ± ε are also regular. By
hypothesis, there exists a positive p such that (65) holds, whence


||(A − (λ ± ε)E)x|| ≥ ||(A − λE)x|| − ε ||x|| ≥ (p − ε) ||x||,


from which it follows that, when ε < p, every point λ ± ε is regular.

COROLLARY 4. The points of the spectrum of a self-conjugate operator
form a closed set. This follows at once from Corollary 3 [32].

THEOREM 3. The values λ = m and λ = M belong to the spectrum.

Let us prove this for λ = M, on the assumption that M > m.
We introduce the self-conjugate operator B = A − mE, having the
bounds 0 and M_1 = M − m > 0. Its norm is equal to M_1 [126].
It follows from the definition of upper bound that there exists a
sequence x_n of normalized elements such that (Bx_n, x_n) = M_1 − δ_n,
where δ_n ≥ 0 and δ_n → 0. We have


||Bx_n − M_1 x_n||² = ||Bx_n||² − 2M_1 (Bx_n, x_n) + M_1² ≤
≤ M_1² − 2M_1 (M_1 − δ_n) + M_1² = 2M_1 δ_n,


whence it follows that ||(B − M_1 E)x_n|| → 0, and by Corollary 1 of
Theorem 2, B − M_1 E = A − ME has no bounded inverse.

We now show that λ = m belongs to the spectrum. The bounds of
the self-conjugate operator (−A) will be (−M) and (−m), where
−M < −m, so that, by what has been proved, −A + mE has no
bounded inverse, i.e. it does not satisfy condition (65), so that A − mE
likewise does not satisfy this condition.

We have seen above that, if λ belongs to the spectrum, but is not
an eigenvalue, the lineal R(A − λE) is everywhere dense in H.
We shall show that in this case R(A − λE) is not the whole of H.
This follows at once from the next theorem [cf. 97]:

THEOREM 4. If R(A) is the whole of H, the inverse operator A⁻¹ is
bounded.

We notice first of all that, if R(A) is H, it follows from what was
said at the start of this section that λ = 0 is not an eigenvalue,
i.e. there exists the inverse A⁻¹, defined throughout H.

Let x be an element of H. We shall denote the element Ax by
the same letter with a prime, i.e. x′ = Ax, y′ = Ay and so on. It is
readily seen that the operator A⁻¹ has the following symmetry
property:

(A⁻¹x′, y′) = (x′, A⁻¹y′), (66)

where x′ and y′ are any elements of H. For, (66) is equivalent to the
equation (x, Ay) = (Ax, y), which holds because A is self-conjugate.
Given any choice of normalized element y′, expression (66) is a linear
functional l_{y′}(x′) of x′. Given any fixed element x′, the numbers
|l_{y′}(x′)| are bounded for this family of functionals. For, |l_{y′}(x′)| =
= |(A⁻¹x′, y′)| ≤ ||A⁻¹x′||. Hence it follows that the norms of
functionals (66) are bounded with ||y′|| = 1 [100]. But these norms
are equal to ||A⁻¹y′|| [123], so that there exists a number q such
that ||A⁻¹y′|| ≤ q with ||y′|| = 1, which is what we wanted to
prove.

The theorem obviously holds for the operator $A - \lambda E$ with any
real $\lambda$. The case of non-real $\lambda$ was discussed above. We shall investigate
the spectrum of a self-conjugate operator in more detail in a subsequent
section devoted to unbounded operators.


130. The resolvent. If $\lambda$ is not an eigenvalue of $A$, the resolvent
of $A$ exists:

$$R_\lambda = (A - \lambda E)^{-1}.$$

It is defined on $R(A - \lambda E)$ and transforms this lineal one-to-one
into $H$. It follows from the definition of inverse operator that $x = 0$
if $x$ belongs to $R(A - \lambda E)$ and $R_\lambda x = 0$.

We shall consider below the case when $\lambda$ is a regular point of $A$.
In this case $R(A - \lambda E)$ is $H$, and $R_\lambda$ is a bounded operator defined
throughout $H$.

Let us prove the following two formulae (for regular $\lambda$ and $\mu$):

$$R_\lambda^* = R_{\bar\lambda}, \qquad R_\mu - R_\lambda = (\mu - \lambda) R_\mu R_\lambda. \tag{67}$$

If $\lambda$ is not real, $\bar\lambda$ is also not real, and hence is also a regular point.
We can therefore assert that, for real and non-real $\lambda$, any elements $x'$
and $y'$ of $H$ can be written as

$$x' = (A - \lambda E)x; \qquad y' = (A - \bar\lambda E)y.$$

Hence it follows that

$$(R_\lambda x', y') = (x, (A - \bar\lambda E)y) = ((A - \lambda E)x, y) = (x', R_{\bar\lambda} y'),$$

and the first of (67) is an immediate consequence of this. The definition
of resolvent implies that

$$R_\lambda = R_\mu (A - \mu E) R_\lambda; \qquad R_\mu = R_\mu (A - \lambda E) R_\lambda.$$


On subtracting term by term, we arrive at the second of (67). 
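
The subtraction can be written out in full. The display below is a worked check of the second formula of (67); it uses only the identities $R_\mu (A - \mu E) = E$ and $(A - \lambda E) R_\lambda = E$, which hold for regular $\lambda$ and $\mu$:

```latex
\begin{align*}
R_\mu - R_\lambda
  &= R_\mu (A - \lambda E) R_\lambda - R_\mu (A - \mu E) R_\lambda \\
  &= R_\mu \bigl[(A - \lambda E) - (A - \mu E)\bigr] R_\lambda \\
  &= R_\mu \bigl[(\mu - \lambda) E\bigr] R_\lambda
   = (\mu - \lambda)\, R_\mu R_\lambda .
\end{align*}
```

Interchanging $\lambda$ and $\mu$ in this identity shows in addition that $R_\mu R_\lambda = R_\lambda R_\mu$ for $\lambda \neq \mu$.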




131. Sequences of operators. Everything said about sequences of
linear operators in $B$ spaces [104] holds for $H$. Let us recall the basic
facts and make some additions. The convergence in norm of a sequence
of linear operators $A_n$ to a linear operator $A$ is defined by the
condition $\|A - A_n\| \to 0$ as $n \to \infty$. The necessary and sufficient
condition for this is that $\|A_n - A_m\| \to 0$ as $n$ and $m \to \infty$.

Strong convergence (or simply convergence) is defined by the fact
that $A_n x \Rightarrow Ax$ for any $x \in H$. The norms $\|A_n\|$ are bounded. The
necessary and sufficient condition for strong convergence is
$\|A_n x - A_m x\| \to 0$ as $n$ and $m \to \infty$, for any $x \in H$. Convergence in norm
implies strong convergence. If $A_n \to A$, $B_n \to B$ (in the sense of
strong convergence or of convergence in norm) and the numbers
$a_n \to a$, then $a_n A_n \to aA$, $A_n + B_n \to A + B$ and $B_n A_n \to BA$.

If $A_n$ are self-conjugate operators, and $A_n \to A$, then $A$ is also
self-conjugate. For, $(A_n x, x)$ is real for any $n$ and any $x \in H$. Hence
it follows that $(Ax, x) = \lim (A_n x, x)$ is real for any $x \in H$, so that
$A$ is a self-conjugate operator.
Since we possess the concept of limit, we can consider infinite
series of linear operators in $H$, e.g.

$$B_1 + B_2 + B_3 + \ldots$$

and speak of their convergence in one sense or another. The following
example will be important later. Let $A$ be a linear operator and
$\|A\| = q < 1$. We form the series:

$$S = E + aA + a^2A^2 + \ldots, \tag{68}$$

where $a$ is a complex number. On writing $S_n$ for the operator equal
to the sum of the first $n$ terms of this series, we have

$$\|S_{n+p} - S_n\| = \|a^n A^n + a^{n+1} A^{n+1} + a^{n+2} A^{n+2} + \ldots + a^{n+p-1} A^{n+p-1}\|,$$

whence, assuming $|a| \leq 1$,

$$\|S_{n+p} - S_n\| \leq q^n + q^{n+1} + \ldots + q^{n+p-1},$$

so that $\|S_{n+p} - S_n\| \to 0$ as $n \to \infty$ for any $p > 0$, i.e. series (68)
is convergent in norm when $|a| \leq 1$. Since the upper bound of
$\|S_{n+p} - S_n\|$ does not contain $a$, series (68) is said to be uniformly
convergent with respect to $a$ for $|a| \leq 1$. Since $\|A\| = q < 1$,
we can say that series (68) is uniformly convergent in norm with
respect to $a$ if $|a| \leq 1 + \varepsilon$, where $\varepsilon > 0$ is chosen so small that

$$(1 + \varepsilon) q < 1.$$




On multiplying series (68) by $(E - aA)$ and taking into account
what was said previously about passage to the limit for a sequence
of operators, we get $(E - aA)S = S(E - aA) = E$, i.e. the sum of
series (68) is the bounded inverse of the operator $(E - aA)$ for
$|a| \leq 1 + \varepsilon$, i.e. $S = (E - aA)^{-1}$.
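
The multiplication can be carried out on the partial sums first. The display below is a sketch of that step, with $S_n$ denoting the sum of the first $n$ terms of (68) and $q = \|A\|$; the bound in the second line assumes $|a| \leq 1 + \varepsilon$ with $(1 + \varepsilon) q < 1$:

```latex
\begin{align*}
(E - aA)\, S_n &= (E - aA)\bigl(E + aA + \ldots + a^{n-1}A^{n-1}\bigr)
               = E - a^n A^n = S_n\, (E - aA), \\
\|a^n A^n\| &\leq |a|^n \, \|A\|^n \leq \bigl((1 + \varepsilon)\, q\bigr)^n \to 0
               \qquad (n \to \infty).
\end{align*}
```

Passage to the limit in norm as $n \to \infty$ then gives $(E - aA)S = S(E - aA) = E$.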

The proof of the following statement is similar to the above: if the
norms of operators $A_k$ do not exceed positive numbers $\delta_k$, which form
a convergent series, the series

$$A = A_1 + A_2 + \ldots$$

is convergent in norm, and the norm of $A$ is not greater than the
sum of the series formed from the $\delta_k$. The last assertion follows from
the fact that, if the norm of $A$ were greater than this sum, the norm
of the operator

$$S_n = A_1 + A_2 + \ldots + A_n$$

would also be greater than this sum for sufficiently large $n$, and this
contradicts the obvious inequality

$$\|S_n\| \leq \|A_1\| + \|A_2\| + \ldots + \|A_n\| \leq \delta_1 + \delta_2 + \ldots + \delta_n.$$


A proposition similar to the above obviously also holds for normed 
spaces. 


132. Weak convergence. Since we possess the general form of
linear functional in $H$ [123], the weak convergence $x_n \xrightarrow{w} x_0$ is
equivalent to the fact that $(x_n, y) \to (x_0, y)$ for any $y \in H$. We recall
that $x_n \xrightarrow{w} x_0$ implies the existence of a number $m > 0$ such that
$\|x_n\| \leq m$ for all values of $n$. Further, since the conjugate space $H^*$
coincides with $H$, every bounded set in $H$ is weakly compact, and
$H$ has weak completeness, i.e. if $(x_n - x_m, y) \to 0$ as $n$ and $m \to \infty$
for any $y \in H$, the sequence $x_n$ is weakly convergent. We also know
that, if $x_n \xrightarrow{w} x_0$ and $A$ is a linear operator, $Ax_n \xrightarrow{w} Ax_0$. Let us now
show that, if $z_k$ ($k = 1, 2, \ldots$) is an orthonormal system, complete
in $H$, and $\|x_n\| \leq m$, the proof that $x_n \xrightarrow{w} x_0$ only requires a proof
that $(x_n, z_k) \to (x_0, z_k)$ ($k = 1, 2, \ldots$). For, let $(x_n, z_k) \to (x_0, z_k)$
($k = 1, 2, \ldots$). Any element $y \in H$ can be written as

$$y = \sum_{k=1}^{\infty} b_k z_k,$$

where

$$\sum_{k=1}^{\infty} |b_k|^2 = \|y\|^2.$$



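We write

$$(x_n - x_0, y) = \Bigl(x_n - x_0, \sum_{k=1}^{N} b_k z_k\Bigr) + \Bigl(x_n - x_0, \sum_{k=N+1}^{\infty} b_k z_k\Bigr),$$

and take any given $\varepsilon > 0$. Since $\|x_n\| \leq m$, we have [121]:

$$\Bigl|\Bigl(x_n - x_0, \sum_{k=N+1}^{\infty} b_k z_k\Bigr)\Bigr| \leq \|x_n - x_0\| \cdot \Bigl\|\sum_{k=N+1}^{\infty} b_k z_k\Bigr\| \leq (m + \|x_0\|) \Bigl(\sum_{k=N+1}^{\infty} |b_k|^2\Bigr)^{1/2},$$

and we can fix an $N$ such that the right-hand side of this inequality
is $\leq \varepsilon/2$. Now,

$$|(x_n - x_0, y)| \leq \Bigl|\Bigl(x_n - x_0, \sum_{k=1}^{N} b_k z_k\Bigr)\Bigr| + \frac{\varepsilon}{2}.$$

Since $(x_n, z_k) \to (x_0, z_k)$, for all sufficiently large $n$ the first term
on the right-hand side is $\leq \varepsilon/2$ and $|(x_n - x_0, y)| \leq \varepsilon$, whence it
follows that $(x_n, y) \to (x_0, y)$.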

Let us prove the following theorem.

THEOREM 1. If $x_n \xrightarrow{w} x_0$ and $y_n \Rightarrow y_0$, then $(x_n, y_n) \to (x_0, y_0)$ and
$(y_n, x_n) \to (y_0, x_0)$. It is sufficient to show that $(x_n, y_n) \to (x_0, y_0)$. The
second assertion is obtained by interchanging the elements. We can
write

$$|(x_0, y_0) - (x_n, y_n)| \leq |(x_0, y_0) - (x_n, y_0)| + |(x_n, y_0) - (x_n, y_n)|,$$

or

$$|(x_0, y_0) - (x_n, y_n)| \leq |(x_0, y_0) - (x_n, y_0)| + |(x_n, y_0 - y_n)|.$$

On recalling that $x_n \xrightarrow{w} x_0$ implies the existence of an $m > 0$ such
that $\|x_n\| \leq m$, and using Buniakowski's inequality, we get

$$|(x_0, y_0) - (x_n, y_n)| \leq m \|y_0 - y_n\| + |(x_0, y_0) - (x_n, y_0)|.$$

The first term on the right $\to 0$, since $y_n \Rightarrow y_0$, whilst the second
$\to 0$ because $x_n \xrightarrow{w} x_0$. Therefore $(x_n, y_n) \to (x_0, y_0)$, and the theorem
is proved.

THEOREM 2. If $x_n \xrightarrow{w} x_0$ and $\|x_n\| \to \|x_0\|$, then $x_n \Rightarrow x_0$. We have

$$\|x_0 - x_n\|^2 = \|x_0\|^2 + \|x_n\|^2 - (x_n, x_0) - (x_0, x_n).$$

It follows from the hypothesis that $(x_n, x_0) \to \|x_0\|^2$, $(x_0, x_n) \to \|x_0\|^2$
and $\|x_n\|^2 \to \|x_0\|^2$, so that $\|x_0 - x_n\|^2 \to 0$, which is
what we needed to prove.

As indicated in [101], this theorem also holds for certain B spaces. 
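
The condition on the norms in Theorem 2 cannot be dropped. The example below is supplied for illustration only; $z_n$ denotes an arbitrary infinite orthonormal sequence in $H$, assumed given. Bessel's inequality yields weak convergence to the zero element, while the norms remain equal to 1:

```latex
\begin{align*}
\sum_{n=1}^{\infty} |(y, z_n)|^2 \leq \|y\|^2
  &\;\Longrightarrow\; (z_n, y) \to 0 = (0, y) \quad \text{for every } y \in H, \\
\|z_n - 0\| = 1 &\;\not\to\; 0 .
\end{align*}
```

Thus $z_n$ converges weakly to 0 but not strongly; the hypothesis $\|x_n\| \to \|x_0\|$ fails here, since $\|z_n\| = 1$ whilst the limit element has norm 0.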




133. Completely continuous operators. We have had the definition
of completely continuous operator for a $B$ space and hence for $H$.
A completely continuous operator in $H$ is a linear operator such that
it transforms every set bounded in $H$ into a compact set.

We know that every linear operator transforms a compact set into
a compact set. Notice that the operator of the identity transformation
is not completely continuous. It transforms the sphere $\|x\| \leq 1$
(a bounded set) one-to-one into itself, and such a sphere is not compact.
To see this, we only need to take an infinite orthonormal sequence
of elements $z_k$ ($k = 1, 2, \ldots$). It is bounded, since $\|z_k\| = 1$, but not
compact, because $\|z_p - z_q\| = \sqrt{2}$ for $p \neq q$.
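
The distance between two distinct elements of the orthonormal sequence can be verified directly from the properties of the scalar product. The display below is such a check, assuming only $\|z_k\| = 1$ and $(z_p, z_q) = 0$ for $p \neq q$:

```latex
\begin{align*}
\|z_p - z_q\|^2 &= (z_p - z_q,\; z_p - z_q) \\
  &= \|z_p\|^2 - (z_p, z_q) - (z_q, z_p) + \|z_q\|^2 \\
  &= 1 - 0 - 0 + 1 = 2 \qquad (p \neq q).
\end{align*}
```

No subsequence of the $z_k$ can therefore satisfy the Cauchy condition, so that none is convergent.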

Two new definitions of completely continuous operator will be 
given below, and their equivalence to the above fundamental def- 
inition will be proved. A simple preliminary remark is required. 

If, given two sequences $x_n$ and $y_n$, one is weakly and the other strongly
convergent, to $x_0$ and $y_0$ respectively, and if $A$ is a bounded linear operator,
we have

$$\lim_{n \to \infty} (Ax_n, y_n) = (Ax_0, y_0). \tag{69}$$

This follows at once from Theorem 1 [132] and the fact that, if
$x_n \xrightarrow{w} x_0$ then $Ax_n \xrightarrow{w} Ax_0$, and if $x_n \Rightarrow x_0$ then $Ax_n \Rightarrow Ax_0$. We shall
now give two new definitions of completely continuous operator.

DEFINITION 1. A linear operator $A$ is said to be completely continuous
if (69) holds for any sequences $x_n$ and $y_n$, weakly convergent to
$x_0$ and $y_0$.

DEFINITION 2. A linear operator $A$ is said to be completely continuous
if $x_n \xrightarrow{w} x_0$ implies $Ax_n \Rightarrow Ax_0$.

Let us show that these definitions are equivalent. Let $A$ satisfy
condition (69) for the weakly convergent sequences $x_n$ and $y_n$. We can
write

$$\|Ax_n - Ax_0\|^2 = (Ax_n, Ax_n - Ax_0) - (Ax_0, Ax_n - Ax_0).$$

If $x_n \xrightarrow{w} x_0$ then $Ax_n - Ax_0 \xrightarrow{w} 0$, and both terms on the right-hand
side tend to zero by (69), i.e. $\|Ax_n - Ax_0\| \to 0$ or $Ax_n \Rightarrow Ax_0$.
Thus the second definition follows from the first. Now suppose that
$x_n \xrightarrow{w} x_0$ implies that $Ax_n \Rightarrow Ax_0$. Formula (69) now follows at once
from Theorem 1. Having proved that these two definitions are
equivalent, they must both be equivalent to the basic definition if
it is shown that Definition 2 is equivalent to the basic definition.
Let $A$ satisfy Definition 2 and $x_n$ be a bounded sequence of elements.
We can choose a subsequence $x_{n_k}$ such that $x_{n_k} \xrightarrow{w} x_0$, and hence,
by Definition 2, $Ax_{n_k} \Rightarrow Ax_0$, i.e. the set $Ax_n$ is compact, and the basic
definition follows from Definition 2. Suppose conversely that $A$
satisfies the basic definition, and let $x_n \xrightarrow{w} x_0$. We have to show that
$Ax_n \Rightarrow Ax_0$. We use reductio ad absurdum. Let $Ax_n$ not be strongly
convergent to $Ax_0$, i.e. there exists a subsequence of subscripts $n_k$ such that
$\|Ax_{n_k} - Ax_0\| \geq a > 0$. By the basic definition, the set $Ax_{n_k}$ is compact,
and we can assume that $Ax_{n_k}$ is strongly convergent to some element
$x'$, which, since $\|Ax_{n_k} - Ax_0\| \geq a > 0$, must differ from $Ax_0$.
But $x_{n_k} \xrightarrow{w} x_0$, and consequently $Ax_{n_k} \xrightarrow{w} Ax_0$, whilst by the foregoing,
$Ax_{n_k} \Rightarrow x' \neq Ax_0$, and all the more, $Ax_{n_k} \xrightarrow{w} x' \neq Ax_0$. We have
arrived at a contradiction.

Thus the new definitions of completely continuous operator are
equivalent to the basic definition. We shall explain later the concepts
of weak convergence in $l_2$ and $L_2$.

THEOREM. If $A$ is a linear operator and $A^*A$ is completely continuous,
$A$ is also completely continuous. Let $x_n$ ($n = 1, 2, \ldots$) be a bounded
sequence of elements ($\|x_n\| \leq a$). By hypothesis, $A^*Ax_n$ is compact,
i.e. there exists a convergent subsequence $A^*Ax_{n_k}$. Let us show that
$Ax_{n_k}$ is also a convergent subsequence, whence the theorem will
follow. We have

$$\|Ax_{n_p} - Ax_{n_q}\|^2 = (A^*A(x_{n_p} - x_{n_q}),\, x_{n_p} - x_{n_q}) \leq \|A^*Ax_{n_p} - A^*Ax_{n_q}\| \, \|x_{n_p} - x_{n_q}\| \leq 2a \, \|A^*Ax_{n_p} - A^*Ax_{n_q}\|.$$

In view of the convergence of the sequence $A^*Ax_{n_k}$, the right-hand
side $\to 0$ as $n_p$ and $n_q \to \infty$, so that $\|Ax_{n_p} - Ax_{n_q}\| \to 0$, i.e. $Ax_{n_k}$
is a convergent sequence.

COROLLARY. If $A$ is completely continuous, $A^*$ is also completely
continuous.

If $A$ is completely continuous, $AA^* = (A^*)^*A^*$ is completely
continuous; but now, by the theorem just proved, applied to $A^*$, $A^*$ is also
completely continuous.

Let us recall the following property of sequences of operators:
if a sequence $A_n$ of completely continuous operators is convergent
in norm to a linear operator $A$, $A$ is also a completely continuous
operator [106]. A special class of completely continuous operators
must be mentioned.




DEFINITION. A linear operator $D$ is said to be finite-dimensional if
it can be written in the form

$$Dx = \sum_{k=1}^{m} (x, v_k) u_k, \tag{70}$$

where $u_k$ and $v_k$ ($k = 1, 2, \ldots, m$) are fixed elements of $H$.

It may easily be seen that a finite-dimensional operator is completely
continuous. For, $x_n \xrightarrow{w} x_0$ implies that $(x_n, v_k) \to (x_0, v_k)$, and
by (70), $Dx_n \Rightarrow Dx_0$.
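
The last step can be made explicit. The display below is a sketch of the estimate behind it, using only the form (70) of the operator $D$ and the triangle inequality:

```latex
\begin{align*}
\|Dx_n - Dx_0\|
  &= \Bigl\| \sum_{k=1}^{m} (x_n - x_0, v_k)\, u_k \Bigr\| \\
  &\leq \sum_{k=1}^{m} |(x_n - x_0, v_k)| \, \|u_k\| \to 0 \qquad (n \to \infty).
\end{align*}
```

Each of the finitely many terms tends to zero under weak convergence of $x_n$ to $x_0$, so that weak convergence of the $x_n$ implies strong convergence of the $Dx_n$, as Definition 2 requires.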

It follows at once from what has been said that, if $A_n$ is a sequence
of finite-dimensional operators, tending in norm to the linear operator
$A$, then $A$ is completely continuous.

It will be shown in the next section that every completely con- 
tinuous operator can be written as the limit in norm of a sequence of 
finite-dimensional operators. 


134. Spaces $H$ and $l_2$. Let

$$z_1, z_2, z_3, \ldots \tag{71}$$

be a complete orthonormal system in $H$. By using it, we can map $H$
one-to-one into space $l_2$, the elements of which are infinite sequences
of complex numbers $(\xi_1, \xi_2, \ldots)$, on condition that the series

$$\sum_{k=1}^{\infty} |\xi_k|^2 \tag{72}$$

is convergent [121]. Any element $x \in H$ is characterized by its
Fourier coefficients: $\xi_k = (x, z_k)$, and we have the form

$$x = \sum_{k=1}^{\infty} \xi_k z_k. \tag{73}$$

Conversely, if an element $(\xi_1, \xi_2, \ldots)$ of $l_2$ is given, series (73) is
convergent in $H$ and yields the corresponding element of $H$. This
correspondence is one-to-one, the scalar product in $H$ being equal to
the scalar product of the corresponding elements of $l_2$ [60, 121]:

$$(x, y) = \sum_{k=1}^{\infty} \xi_k \bar\eta_k,$$

where $x$ corresponds to $(\xi_1, \xi_2, \ldots)$ and $y$ to $(\eta_1, \eta_2, \ldots)$. Thus $\|x\|$
is equal to the norm of the corresponding element in $l_2$ and
convergence in $H$ and $l_2$ is equivalent. We thus have an isomorphic
mapping of $H$ into $l_2$. The following elements of $l_2$ correspond to the
elements $z_k$ of the orthonormal system (71):

$$(1, 0, 0, 0, \ldots); \quad (0, 1, 0, 0, \ldots); \quad (0, 0, 1, 0, \ldots); \quad \ldots$$

Let $\xi(\xi_1, \xi_2, \ldots)$ be an element of $l_2$ and $\xi^{(n)}(\xi_1, \xi_2, \ldots, \xi_n, 0, 0, \ldots)$
be the cut-off element, in which the first $n$ components are equal
to the corresponding components of $\xi$, and the remainder equal to
zero. We have

$$\|\xi - \xi^{(n)}\|^2 = \sum_{k=n+1}^{\infty} |\xi_k|^2,$$

and, since series (72) is convergent, $\xi^{(n)} \Rightarrow \xi$ in $l_2$.


Let $A$ be a linear operator in $H$. In view of its continuity and (73),
we can write

$$y = Ax = \sum_{k=1}^{\infty} \xi_k Az_k. \tag{74}$$

The components $(\eta_1, \eta_2, \ldots)$ of the element of $l_2$ corresponding
to the element $y$, are defined by

$$\eta_l = (y, z_l) = \sum_{k=1}^{\infty} \xi_k (Az_k, z_l). \tag{75}$$

We have made use here of the continuity of the scalar product.
On introducing the numbers

$$a_{lk} = (Az_k, z_l), \tag{76}$$

a linear operator $A$ in $H$ is seen to correspond with the operator
in $l_2$:

$$\eta_l = \sum_{k=1}^{\infty} a_{lk} \xi_k \qquad (l = 1, 2, 3, \ldots), \tag{77}$$

which is defined by an infinite matrix with elements $a_{lk} = (Az_k, z_l)$.
The conjugate operator $A^*$ corresponds to the matrix with elements

$$a^*_{lk} = (A^*z_k, z_l) = (z_k, Az_l) = \overline{(Az_l, z_k)},$$

i.e.

$$a^*_{lk} = \bar a_{kl}. \tag{78}$$

A self-conjugate operator is characterized by the equation

$$a_{kl} = \bar a_{lk}. \tag{79}$$
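
As an illustration of (76) to (79), consider the shift operator defined by $Az_k = z_{k+1}$ ($k = 1, 2, \ldots$). This operator is a hypothetical example introduced here and is not taken from the text; the convention assumed is that the matrix element with row index $l$ and column index $k$ equals $(Az_k, z_l)$, with $\delta_{lk}$ the Kronecker symbol. A sketch of the computation:

```latex
\begin{align*}
a_{lk} &= (Az_k, z_l) = (z_{k+1}, z_l) = \delta_{l,\,k+1}, \\
a^*_{lk} &= \bar a_{kl} = \delta_{k,\,l+1}, \\
A^* z_k &= \sum_{l=1}^{\infty} a^*_{lk}\, z_l =
  \begin{cases} z_{k-1}, & k \geq 2, \\ 0, & k = 1. \end{cases}
\end{align*}
```

Here $a_{21} = 1$ whilst $a_{12} = 0$, so that condition (79) is violated and the shift operator is not self-conjugate.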




We introduce the set $L$ of elements of $H$ expressible as

$$x = \sum_{k=1}^{m} \xi_k z_k,$$

where $\xi_k$ are any complex numbers and $m$ is a fixed positive integer.
We have $\xi_k = (x, z_k)$, and $L$ is easily shown to be a subspace. The
subspace $M$ orthogonal to it is obviously the set of elements $x$
expressible as

$$\sum_{k=m+1}^{\infty} \xi_k z_k,$$

where $\xi_k$ are complex numbers such that the series

$$\sum_{k=m+1}^{\infty} |\xi_k|^2 \tag{80}$$

is convergent. Space $H$ can be written as [122]:

$$H = L \oplus M. \tag{81}$$

Writing $P_L$ and $P_M$ for the projectors into $L$ and $M$, we have

$$E = P_L + P_M. \tag{82}$$

Let $A$ be a linear operator. We bring in the two operators:

$$A_1 = P_L A; \qquad A_2 = P_M A. \tag{83}$$

By (82): $A = A_1 + A_2$. Since $P_L Ax \in L$ for any $x \in H$:

$$P_L Ax = \sum_{k=1}^{m} \alpha_k z_k,$$

where

$$\alpha_k = (P_L Ax, z_k) = (Ax, P_L z_k) = (Ax, z_k) = (x, A^*z_k),$$

i.e.

$$P_L Ax = A_1 x = \sum_{k=1}^{m} (x, A^*z_k) z_k, \tag{84}$$

whence it follows that $P_L A = A_1$ is a finite-dimensional operator.
Similarly, we have

$$A_2 x = \sum_{k=m+1}^{\infty} (Ax, z_k) z_k = \sum_{k=m+1}^{\infty} (x, A^*z_k) z_k. \tag{85}$$

Thus the element $A_2 x$ corresponds to an element $(\xi_1, \xi_2, \ldots)$, the
components of which are given by

$$\xi_k = 0 \ \text{for} \ k \leq m \quad \text{and} \quad \xi_k = (Ax, z_k) = (x, A^*z_k) \ \text{for} \ k > m. \tag{86}$$




Now let $A$ be a completely continuous operator, and $U$ a set of
normalized elements ($\|x\| = 1$). Now, if $x \in U$, $Ax$ is a compact set,
and hence the corresponding set in $l_2$ is compact. The components
of the elements of this set are given by $\xi_k = (Ax, z_k)$, and since it is
compact, we can say that there exists a positive number $C$ such that,
for any normalized $x$,

$$\sum_{k=1}^{\infty} |(Ax, z_k)|^2 \leq C,$$

and that, given any $\varepsilon > 0$, there exists a positive integer $n_\varepsilon$ such
that [92]

$$\sum_{k=n_\varepsilon+1}^{\infty} |(Ax, z_k)|^2 \leq \varepsilon.$$

But it follows from (85) that

$$\|A_2 x\|^2 = \sum_{k=m+1}^{\infty} |(Ax, z_k)|^2,$$

so that, given any $\varepsilon > 0$, there exists an $m = n_\varepsilon$ such that
$\|A_2 x\| \leq \varepsilon$ for $\|x\| = 1$, i.e. $\|A_2\| \leq \varepsilon$. We have arrived at the following
theorem.

THEOREM 1. If $A$ is a completely continuous operator and $\varepsilon > 0$ is
any given number, there exists a positive integer $m$ such that $\|A_2\| \leq \varepsilon$,
where $A_2$ is the operator defined above.

If we take a sequence of positive numbers $\varepsilon_n$ tending to zero, we
get a sequence of finite-dimensional operators $A_1^{(n)}$ such that
$\|A - A_1^{(n)}\|$ tends to zero, i.e. every completely continuous operator
is a limit in norm of finite-dimensional operators.
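
As an illustration of Theorem 1, consider a diagonal operator given by $Az_k = z_k/k$ ($k = 1, 2, \ldots$), where $z_k$ is the system (71). This operator is a hypothetical example assumed for the purpose of illustration and is not taken from the text. For $x$ with Fourier coefficients $\xi_k = (x, z_k)$, the operator $A_2 = P_M A$ can be estimated directly:

```latex
\begin{align*}
\|A_2 x\|^2 &= \sum_{k=m+1}^{\infty} |(Ax, z_k)|^2
             = \sum_{k=m+1}^{\infty} \frac{|\xi_k|^2}{k^2}
             \leq \frac{1}{(m+1)^2} \sum_{k=m+1}^{\infty} |\xi_k|^2
             \leq \frac{\|x\|^2}{(m+1)^2}, \\
\|A_2\| &= \frac{1}{m+1}
  \qquad \text{(the bound is attained at } x = z_{m+1}\text{)}.
\end{align*}
```

Hence $\|A_2\| \leq \varepsilon$ as soon as $m + 1 \geq 1/\varepsilon$, and the finite-dimensional operators $A_1 = P_L A$ tend to $A$ in norm as $m \to \infty$, so that this operator is completely continuous.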

On recalling what was said in [133], we can assert that the following 
definition of completely continuous operator is equivalent to the 
original (a linear operator transforming every bounded set into a 
compact set). 

DEFINITION. A linear operator is said to be completely continuous 
if it is the limit in norm of a sequence of finite-dimensional operators. 


135. Linear equations in completely continuous operators. Let
us consider the solubility in space $H$ of equations of the form

$$x - Ax = y, \tag{87}$$

$$x - A^*x = y, \tag{88}$$

where $A$ is a completely continuous operator, $A^*$ the conjugate to $A$,
$y$ a given element and $x$ the required element of $H$. As was shown by
Riesz, the fundamental theorems of the theory of integral equations
(the Fredholm theorems) remain valid for equations (87) and (88) in $B$
spaces as well as in $H$ space (as proved in [107]). Let us investigate
these equations in $H$.

We fix the number $m$ appearing in the formation of operators $A_1$
and $A_2$ of the previous section so as to have $\|A_2\| < 1$. Then $\|A_2^*\| < 1$.
Thus

$$\|A_2\| = \|A_2^*\| < 1, \tag{89}$$

and equations (87) and (88) can be rewritten as

$$(E - A_2)x - A_1 x = y, \tag{90}$$

$$(E - A_2^*)x - A_1^* x = y. \tag{91}$$

By (89), the operators $(E - A_2)$ and $(E - A_2^*)$ have bounded
inverses [131].
We introduce the following notation:

$$\tilde{x} = (E - A_2)x; \qquad \tilde{y} = (E - A_2^*)^{-1} y. \tag{92}$$

We rewrite (90) in terms of $\tilde{x}$ instead of $x$, and apply the operator
$(E - A_2^*)^{-1}$ to both sides of (91). This gives us

$$\tilde{x} - B\tilde{x} = y, \tag{93}$$

$$x - B^*x = \tilde{y}, \tag{94}$$

where $\tilde{y}$ is the given element and

$$B = A_1 (E - A_2)^{-1}; \qquad B^* = (E - A_2^*)^{-1} A_1^*,$$

$B^*$ being the conjugate to $B$. Equation (94) is equivalent to (91),
and solving (90) amounts to solving (93) and using the formula
$x = (E - A_2)^{-1} \tilde{x}$. Solving equations (90) and (91) thus amounts to
solving (93) and (94). Let us also write the corresponding homogeneous
equations:

$$\tilde{x} - B\tilde{x} = 0, \qquad x - B^*x = 0. \tag{95}$$

The operator $B = P_L A(E - A_2)^{-1}$ is finite-dimensional, and $B\tilde{x}$
is given by (84) after $A$ has been replaced by $A(E - A_2)^{-1}$. The
matrix corresponding to operator $B$ in $l_2$ will have the elements

$$a_{kl} = (P_L A(E - A_2)^{-1} z_l, z_k) = (A(E - A_2)^{-1} z_l, P_L z_k). \tag{96}$$

But $P_L z_k = 0$ for $k > m$, so that $a_{kl} = 0$ for $k > m$. Let $\xi_k$ and
$\eta_k$ be the components of elements of $l_2$ corresponding to the elements
$\tilde{x}$ and $y$ of $H$. Equation (93) becomes in $l_2$:

$$\xi_k - \sum_{l=1}^{\infty} a_{kl} \xi_l = \eta_k \qquad (k = 1, 2, \ldots, m), \tag{97}$$

$$\xi_k = \eta_k \qquad (k = m+1, m+2, \ldots), \tag{98}$$

where $\xi_k$ are required, and $\eta_k$ are given numbers. Hence all the $\xi_k$
are known for $k > m$, and the solution of (93) amounts in $l_2$ to solving
a system of $m$ equations with $m$ unknowns $\xi_k$ ($k = 1, 2, \ldots, m$):

$$\xi_k - \sum_{l=1}^{m} a_{kl} \xi_l = \eta_k + \sum_{l=m+1}^{\infty} a_{kl} \eta_l. \tag{99}$$
l=m+1 


The matrix corresponding to $B^*$ is $a^*_{kl} = \bar a_{lk}$ [134], so that (94)
has the form in $l_2$:

$$\xi'_k - \sum_{l=1}^{m} \bar a_{lk} \xi'_l = \eta'_k \qquad (k = 1, 2, \ldots), \tag{100}$$

where $\xi'_k$ and $\eta'_k$ are the components of the elements of $l_2$ corresponding
to $x$ and $\tilde{y}$ of $H$.
Each solution $(\xi'^{(0)}_1, \xi'^{(0)}_2, \ldots, \xi'^{(0)}_m)$ of the first $m$ equations of the
system

$$\xi'_k - \sum_{l=1}^{m} \bar a_{lk} \xi'_l = \eta'_k \qquad (k = 1, 2, \ldots, m) \tag{101}$$

leads to a corresponding single definite solution
$(\xi'^{(0)}_1, \xi'^{(0)}_2, \ldots, \xi'^{(0)}_m, \xi'_{m+1}, \ldots)$ of the entire system (100), whatever the remaining $\eta'_k$
($k = m+1, m+2, \ldots$), in which the remaining unknowns $\xi'_k$
($k = m+1, m+2, \ldots$) are defined by the formulae

$$\xi'_k = \eta'_k + \sum_{l=1}^{m} \bar a_{lk} \xi'_l \qquad (k = m+1, m+2, \ldots). \tag{102}$$

Notice that the $\xi_k$ obtained from (98) and (99), and the $\xi'_k$ from
(101) and (102), are such that the series with general terms $|\xi_k|^2$
and $|\xi'_k|^2$ are convergent. This follows at once from (98) for $\xi_k$, and
from (102) for $\xi'_k$, if we take into account the convergence over $k$
of the series with general terms $|\eta'_k|^2$ and $|a_{lk}|^2$. The last follows
from the fact that, by (96):

$$\bar a_{lk} = ((E - A_2^*)^{-1} A^* P_L z_l, z_k).$$




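In the homogeneous case we have to put $\eta_k = \eta'_k = 0$. The homogeneous
system (99) can be rewritten as

$$\xi_k - \sum_{l=1}^{m} a_{kl} \xi_l = 0 \qquad (k = 1, 2, \ldots, m) \tag{103}$$

and $\xi_k = 0$ for $k > m$; and system (101) as

$$\xi'_k - \sum_{l=1}^{m} \bar a_{lk} \xi'_l = 0 \qquad (k = 1, 2, \ldots, m), \tag{104}$$

$$\xi'_k = \sum_{l=1}^{m} \bar a_{lk} \xi'_l \qquad \text{for } k > m. \tag{105}$$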
i=l 


Notice that the linearly independent solutions of the finite homo- 
geneous system (104) generate, by (105), linearly independent solutions 
of the entire homogeneous system of an infinite number of equations, 
corresponding to the homogeneous equation (94) in H. The linearly 
dependent solutions of system (104) generate linearly dependent 
solutions of the entire system. On recalling the basic results regarding 
the solutions of systems of equations, and the fact that the matrices 
of the coefficients of systems (103) and (104) have the same rank, 
we get the following theorem: 

THEOREM 1. Non-homogeneous equations (87) and (88) are soluble 
with any right-hand sides y when and only when the corresponding 
homogeneous equations (y = 0) have only the zero solution. In this case 
the solutions of (87) and (88) are unique for any given y. The homogeneous 
equations x — Ax = 0 and x — A*x = 0 have the same finite number 
of linearly independent solutions. 

We now consider non-homogeneous equation (87) in the case when 
the homogeneous equation has non-zero solutions; in fact, we prove 
the following theorem. 

THEOREM 2. The necessary and sufficient condition for the non-homogeneous
equation (87) to have a solution in this case is that the right-hand
side $y$ be orthogonal to all the solutions of the homogeneous equation

$$x - A^*x = 0. \tag{106}$$

Necessity. We shall give the proof without having recourse to $l_2$.
Let (87) have a solution $x_0$, i.e. $x_0 - Ax_0 = y$, and let $z$ be any
given solution of (106), i.e. $z - A^*z = 0$. We have to show that
$(y, z) = 0$. We have

$$(y, z) = (x_0 - Ax_0, z) = (x_0, z - A^*z) = (x_0, 0) = 0.$$




Sufficiency. Given that $y$ is orthogonal to all the solutions of (106),
we have to show that (87) has a solution. On passing to $l_2$, we have
by hypothesis:

$$\sum_{k=1}^{\infty} \eta_k \bar{\xi}'_k = 0, \tag{107}$$

where $(\xi'_1, \xi'_2, \ldots, \xi'_m)$ is any solution of system (104) and the $\xi'_k$
for $k > m$ are given by (105). On substituting these expressions
for $\xi'_k$ when $k > m$, we can rewrite (107) as

$$\sum_{k=1}^{m} \Bigl(\eta_k + \sum_{l=m+1}^{\infty} a_{kl} \eta_l\Bigr) \bar{\xi}'_k = 0.$$

Since the sums in the curved brackets are the right-hand sides of
equations (99), whilst $(\xi'_1, \xi'_2, \ldots, \xi'_m)$ is any solution of system (104),
we can say that system (99) has a solution [II]; 15], so that equation 
(87) has a solution, and the theorem is proved. 
We now consider the equation

$$x - \mu Ax = y, \tag{108}$$

where $A$ is a completely continuous operator and $\mu$ is a complex
parameter. The operator $\mu A$ is also completely continuous, and the
theorems proved above are applicable to equation (108). In particular,
(108) is soluble with any $y$ (and uniquely so), if the homogeneous
equation

$$x - \mu Ax = 0 \quad \text{or} \quad Ax = \lambda x \quad \Bigl(\lambda = \frac{1}{\mu}\Bigr) \tag{109}$$

has only the zero solution (this is obvious with $\mu = 0$). If (109) has
non-zero solutions, the corresponding $\lambda$ is an eigenvalue of the
operator $A$. We now prove the following:

THEOREM 3. There can exist only a finite number of eigenvalues,
satisfying the condition $|\lambda| \geq r$, where $r$ is any given positive number.
In other words, we have to show that there can only exist a finite
number of values of $\mu$ satisfying the condition $|\mu| \leq 1/r$ for which
(109) has non-zero solutions. The proof of this assertion is directly
bound up with the construction that we used in proving Theorem 1.

As in Theorem 1, we put

$$A = A_1 + A_2,$$

where we fix $m$ so large that $\|A_2\|/r = q < 1$. The operator
$(E - \mu A_2)$ now has a bounded inverse for $|\mu| \leq 1/r$, and it is expressible
by the series

$$(E - \mu A_2)^{-1} = E + \mu A_2 + \mu^2 A_2^2 + \ldots, \tag{110}$$

which is uniformly convergent in norm with respect to $\mu$ for
$|\mu| \leq (1/r) + \varepsilon$ [131], where $\varepsilon$ is a sufficiently small positive number.
The values of $\mu$ for which the equation has non-zero solutions are
found by equating to zero the determinant of system (103), i.e. the
determinant $\Delta$ with elements $\delta_{kl} - a_{kl}$, where $\delta_{kl} = 0$ for $k \neq l$, and
$\delta_{kl} = 1$ for $k = l$, and

$$a_{kl} = (P_L\, \mu A (E - \mu A_2)^{-1} z_l, z_k).$$

In view of the convergence of series (110), our remarks about
passage to the limit for a sequence of operators, and the continuity
of the scalar product, we can assert that $a_{kl}$ are regular functions in
the circle $|\mu| \leq 1/r$. The determinant $\Delta$ obviously has the same
property, so that the equation $\Delta = 0$ has only a finite number of
roots satisfying $|\mu| \leq 1/r$, which is what we set out to prove.

This last theorem can be alternatively stated as: the eigenvalues
$\lambda$ of a completely continuous operator can only have $\lambda = 0$ as a
limit point.

It follows from what has been said that the rank of any eigenvalue
$\lambda$ satisfying $|\lambda| \geq r$ (that is, $|\mu| \leq 1/r$) does not exceed the number $m$ in system
(103) with the condition $\|A_2\|/r = q < 1$. If $A$ is not self-conjugate,
it may not have any eigenvalues [IV; 13].


136. Completely continuous self-conjugate operators. We investigated
the properties of the spectrum and the expansion in eigenfunctions
of a completely continuous self-conjugate operator in
[IV; 38, 39]. All the proofs can be carried over without change to
space $H$. But we have postulated that $H$ is complete, and this fact
was not used in the proofs of Volume IV. Thus new results may be
obtained for $H$. Let us first state a theorem which is obtained from
the results of Volume IV. Remember that all the eigenvalues of a
self-conjugate operator are real.

THEOREM 1. Every self-conjugate completely continuous operator $A$,
different from the annihilation operator, has at least one eigenvalue different
from zero. All the eigenvalues of $A$ have a finite rank and only a finite
number of eigenvalues can lie outside any interval $[-\varepsilon, +\varepsilon]$, where
$\varepsilon > 0$. Every element of the form $Ax$ ($x \in H$) can be expanded as a
Fourier series in the orthonormal system of eigenelements $x_k$ corresponding
to the eigenvalues that differ from zero:

$$Ax = \sum_k (Ax, x_k) x_k = \sum_k (x, x_k) \lambda_k x_k. \tag{111}$$
k 


Sum (111) can contain either a finite or an infinite number of
terms. Further, it may be recalled that the eigenvalues $\lambda_k$ and
eigenelements $x_k$ that form the orthonormal system are obtained from
the solution of successive extremal problems for the quadratic form
$(Ax, x)$. This provides the basis for the proof of the fundamental
theorem in Volume IV.

Suppose that sum (111) contains an infinite number of terms.
Let $x$ be any element of $H$. We form the difference:

$$z = x - \sum_{k=1}^{\infty} (x, x_k) x_k. \tag{112}$$

The series written is convergent [121]. By (111),

$$A\Bigl[x - \sum_{k=1}^{\infty} (x, x_k) x_k\Bigr] = 0.$$

It will be seen from this that $z$ satisfies the equation

$$Az = 0, \quad \text{or} \quad Az = 0 \cdot z, \tag{113}$$

i.e. $z$ is either the zero element or an eigenelement of $A$ corresponding
to the eigenvalue $\lambda = 0$. Let $z_1, z_2, \ldots$ be a complete orthonormal
system of eigenelements corresponding to $\lambda = 0$. If $\lambda = 0$ is not an
eigenvalue, there will be no such elements. If $\lambda = 0$ is an eigenvalue,
it can be either of finite or infinite rank. Since $z$ is a solution of
equation (113), we can say that

$$x - \sum_{k=1}^{\infty} (x, x_k) x_k = \sum_l c_l z_l, \tag{114}$$

where

$$c_l = \Bigl(x - \sum_{k=1}^{\infty} (x, x_k) x_k,\; z_l\Bigr)$$

or, since $(x_k, z_l) = 0$ [128], we get $c_l = (x, z_l)$, and it follows from
(114) that any element $x$ can be expanded in a Fourier series in
eigenelements of $A$, these elements including those corresponding to the
eigenvalue $\lambda = 0$. We have thus proved the following.

THEOREM 2. The orthonormal system of eigenelements of a completely 
continuous self-conjugate operator is a complete system. 




In other words, we can say, using the terminology of [128], that
a completely continuous self-conjugate operator has a purely point
spectrum.

The whole of the above discussion is applicable to the case when
sum (111) consists of a finite number of terms. If $\lambda = 0$ is not an
eigenvalue, sum (111) consists of an infinite number of terms (axiom 3),
and we have for any element of $H$:

$$x = \sum_{k=1}^{\infty} (x, x_k) x_k. \tag{115}$$


Note. As in the case of integral equations [IV; 29], we have
the following result for a self-conjugate completely continuous operator
$A$. If $\lambda$ is not an eigenvalue or zero, the non-homogeneous equation

$$Ax = \lambda x + y \tag{116}$$

has a unique solution with any given $y$, defined by

$$x = \sum_k \frac{(y, x_k)}{\lambda_k - \lambda}\, x_k. \tag{117}$$

If $\lambda$ is an eigenvalue and the condition for the equation to be
soluble is fulfilled, i.e. $y$ is orthogonal to all the corresponding
eigenelements, the general solution of (116) is given by (117), in which
all the factors for the $x_k$ in which the denominator vanishes, have
to be replaced by arbitrary constants. Suppose that there are both
positive and negative eigenvalues. We enumerate the former, denoted
by $\lambda_k^+$, in order of non-increasing absolute value, and similarly for
the latter, denoted by $\lambda_k^-$, and let $x_k^+$ and $x_k^-$ denote the corresponding
eigenfunctions. We have, in view of expansion (111):

$$(Ax, x) = \sum_k \lambda_k^+ |(x, x_k^+)|^2 + \sum_k \lambda_k^- |(x, x_k^-)|^2, \tag{118}$$

whence the new statement follows at once of the extremal properties
of $\lambda_k^+$ and $\lambda_k^-$, mention of which was made above [cf. IV; 26].

THEOREM 3. The eigenvalue $\lambda_1^+$ is the greatest value of $(Ax, x)$ when
$\|x\| = 1$, and it is attained when $x = x_1^+$, whilst the eigenvalue $\lambda_n^+$
($n > 1$) is the greatest value of $(Ax, x)$ on condition that

$$\|x\| = 1 \quad \text{and} \quad (x, x_1^+) = (x, x_2^+) = \ldots = (x, x_{n-1}^+) = 0,$$

and it is attained when $x = x_n^+$.
Similarly, $\lambda_1^-$ is the least value of $(Ax, x)$ when $\|x\| = 1$ and is
attained when $x = x_1^-$, whilst $\lambda_n^-$ ($n > 1$) is the least value of $(Ax, x)$
on condition that

$$\|x\| = 1 \quad \text{and} \quad (x, x_1^-) = (x, x_2^-) = \ldots = (x, x_{n-1}^-) = 0,$$

and is attained when $x = x_n^-$.
Let us now prove Courant's theorem [IV; 187] for space $H$.
THEOREM 4. Let $z_1, z_2, \ldots, z_{n-1}$ be any fixed elements of $H$ and
$m(z_1, z_2, \ldots, z_{n-1})$ be the strict upper bound of the values of $(Ax, x)$ on
condition that

$$\|x\| = 1 \quad \text{and} \quad (x, z_1) = (x, z_2) = \ldots = (x, z_{n-1}) = 0. \tag{119}$$

Now, $\lambda_n^+$ is the least of the numbers $m(z_1, z_2, \ldots, z_{n-1})$ for all possible
choices of elements $z_1, z_2, \ldots, z_{n-1}$. The proof is similar to that of
[IV; 187]. We have $m(x_1^+, x_2^+, \ldots, x_{n-1}^+) = \lambda_n^+$, and it remains for us
to show that, given any choice of $z_k$ ($k = 1, 2, \ldots, n - 1$),

$$m(z_1, z_2, \ldots, z_{n-1}) \geq \lambda_n^+. \tag{120}$$

We shall seek the element $x$ subject to conditions (119) in the
form

$$x = \sum_{k=1}^{n} c_k x_k^+. \tag{121}$$

Conditions (119) may be written as the following equations for $c_k$:

$$\sum_{k=1}^{n} c_k (x_k^+, z_s) = 0 \qquad (s = 1, 2, \ldots, n - 1), \tag{122}$$

$$\sum_{k=1}^{n} |c_k|^2 = 1. \tag{123}$$

The homogeneous system (122) of $(n - 1)$ equations with $n$
unknowns $c_k$ has non-zero solutions. By multiplying such a solution
by a constant factor, we can also satisfy condition (123). We have thus
found the element of form (121) satisfying conditions (119). We have
for this element:

$$(Ax, x) = \Bigl(\sum_{k=1}^{n} c_k \lambda_k^+ x_k^+,\; \sum_{k=1}^{n} c_k x_k^+\Bigr) = \sum_{k=1}^{n} \lambda_k^+ |c_k|^2.$$

On observing that $\lambda_1^+ \geq \lambda_2^+ \geq \ldots \geq \lambda_n^+$, and using (123), we get
$(Ax, x) \geq \lambda_n^+$, whilst $x$ satisfies conditions (119). All the more,
$m(z_1, z_2, \ldots, z_{n-1})$, which is equal to the strict upper bound of $(Ax, x)$
with conditions (119), is not less than $\lambda_n^+$. Hence we have inequality
(120) and the theorem is proved. The theorem for $\lambda_n^-$ is similar.




Note. It may easily be shown that the strict upper bound
$m(z_1, z_2, \dots, z_{n-1})$ of $(Ax, x)$ is attained on an element $x_0$ satisfying
conditions (119).

For, there exists by hypothesis a sequence of elements $y_n$, satisfying
conditions (119), such that $(Ay_n, y_n) \to m(z_1, z_2, \dots, z_{n-1})$. Since
$\|y_n\| = 1$, we can assume that the $y_n$ are weakly convergent to some
element $x_0$, where the weak convergence implies [132]:


$$\|x_0\| \leq 1 \quad\text{and}\quad (x_0, z_1) = (x_0, z_2) = \dots = (x_0, z_{n-1}) = 0.$$


In view of the complete continuity of A, we have
$(Ax_0, x_0) = m(z_1, z_2, \dots, z_{n-1})$. It remains to show that $\|x_0\| = 1$. It follows
from $m(z_1, z_2, \dots, z_{n-1}) \geq \mu_n^+$ that $m(z_1, z_2, \dots, z_{n-1}) > 0$ and $\|x_0\| > 0$.
If $\|x_0\| < 1$, by introducing the normalized element $y_0 = x_0/\|x_0\|$,
satisfying conditions (119), we should have obtained


$$(Ay_0, y_0) = \frac{1}{\|x_0\|^2}\,(Ax_0, x_0) = \frac{m(z_1, z_2, \dots, z_{n-1})}{\|x_0\|^2} > m(z_1, z_2, \dots, z_{n-1}).$$


But, by the definition of $m(z_1, z_2, \dots, z_{n-1})$, we have
$(Ay_0, y_0) \leq m(z_1, z_2, \dots, z_{n-1})$. The contradiction obtained shows that
$\|x_0\| = 1$, and our assertion that $(Ax, x)$ attains its strict upper bound
$m(z_1, z_2, \dots, z_{n-1})$ is proved. Using Theorem 2 of [132], we can say
that $y_n \Rightarrow x_0$.

The theorem is applied when comparing the eigenvalues of different 
operators [cf. IV; 188].

Notice also a direct consequence of (118) [cf. IV; 26]. The necessary 
and sufficient condition for a completely continuous self-conjugate 
operator A to be positive ($(Ax, x) \geq 0$ for $x \in H$) is that it has no
negative eigenvalues. 

We shall now prove that a completely continuous self-conjugate 
operator is fully defined by the nature of its spectrum, which is 
described in Theorems 1 and 2. 

THEOREM 5. Let a linear self-conjugate operator A have the following
properties: the orthonormal system of its eigenelements $x_k$ $(k = 1, 2, \dots)$
is complete, all the non-zero eigenvalues $\lambda_k$ have finite rank, and only
a finite number of eigenvalues can lie outside any interval $[-\varepsilon, +\varepsilon]$,
where $\varepsilon > 0$. The operator A is now completely continuous.

By hypothesis, we can arrange the $\lambda_k$ in order of non-increasing
absolute value:


$$|\lambda_1| \geq |\lambda_2| \geq |\lambda_3| \geq \dots \qquad (124)$$


and $\lambda_n \to 0$ as $n \to \infty$. Remember that, if an eigenvalue has rank r,
it figures r times in sequence (124) (the eigenvalue $\lambda = 0$ can have
infinite rank).

Since the system of $x_k$ is complete, we have the Fourier expansion
for any element $x \in H$:


$$x = \sum_{k=1}^{\infty} a_k x_k \qquad (125)$$

and

$$Ax = \sum_{k=1}^{\infty} a_k \lambda_k x_k. \qquad (126)$$


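Let U be a bounded set of elements x, i.e. there exists a positive
number l such that


$$\sum_{k=1}^{\infty} |a_k|^2 \leq l^2 \qquad (127)$$

if $x \in U$. We have to show that the set of elements Ax is compact. It is
bounded by virtue of $\|Ax\| \leq n_A l$. It remains to show [92] that, if (127) is
satisfied, given any $\varepsilon > 0$, there exists a positive $n_0$ such that


$$\sum_{k=n_0}^{\infty} |a_k|^2 \lambda_k^2 \leq \varepsilon^2.$$


Since $\lambda_n \to 0$ as $n \to \infty$, there exists an $n_0$ such that $|\lambda_n| \leq \varepsilon/l$
for $n \geq n_0$. Now,


$$\sum_{k=n_0}^{\infty} |a_k|^2 \lambda_k^2 \leq \frac{\varepsilon^2}{l^2} \sum_{k=n_0}^{\infty} |a_k|^2 \leq \frac{\varepsilon^2}{l^2}\, l^2 = \varepsilon^2,$$


and the theorem is proved.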


137. Unitary operators. Along with self-conjugate operators, we 
must consider a further class of linear operators. 
DEFINITION. A linear operator 


y = Ux (128) 


is said to be unitary if it does not change the norm of an element, i.e.
$\|Ux\| = \|x\|$, and transforms H into the whole of H, i.e. given any
$y \in H$, there exists a pre-image x, i.e. an element x such that (128) holds.

Notice that, by the definition, the norm of a unitary operator is 
equal to unity. The basic properties of unitary operators are given 
in the next theorem. 




THEOREM 1. A unitary operator transforms H one-to-one into itself, 
has a bounded inverse, defined by 


$$U^{-1} = U^*, \qquad (129)$$

$$UU^* = U^*U = E, \qquad (130)$$


where $U^{-1}$ is also a unitary operator, and does not change a scalar
product. Condition (130) is sufficient for U to be unitary.

If $x_1$ and $x_2$ are two elements of H, by definition of unitary operator,
we have $\|Ux_1 - Ux_2\| = \|U(x_1 - x_2)\| = \|x_1 - x_2\|$, so that, if
$Ux_1 = Ux_2$, then $x_1 = x_2$, i.e. given different elements x, (128) gives
different y, i.e. U defines a one-to-one transformation of H into
itself. Hence there exists a bounded inverse $U^{-1}$, defined throughout
H, where, since U does not change a norm, we have $\|U^{-1}y\| = \|y\|$,
i.e. $U^{-1}$ is also a unitary operator. Since the norm is invariable,
we can write $(Ux, Ux) = (x, x)$, whence it follows immediately
that

$$(U^*Ux, x) = (x, x).$$


But if two quadratic functionals are equal, the operators appearing
in them must be equal, i.e. $U^*U = E$, whence it follows that $U^*$
is the left-hand inverse of U, and, since a bounded inverse operator
exists for U, we also have $UU^* = E$, and (129) is proved. The
assertion that U does not change the scalar product follows at once
from


$$(Ux, Uy) = (U^*Ux, y) = (x, y). \qquad (131)$$


Finally, let us show that (130) implies that U is unitary. By (130),
U has a bounded inverse, defined by (129). It remains to show that
U does not change a norm. By (130), this follows from (131) with
$y = x$.

Notice also that, if $U_1$ and $U_2$ are two unitary operators, their product
$U_1U_2$ is also a unitary operator. This is an immediate consequence
of the fact that, if $U_1$ and $U_2$ transform H one-to-one into H and
do not change the norm, their product obviously has the same properties.
The inverse of a unitary operator is therefore unitary, and
the product of unitary operators is unitary, i.e. unitary operators form
a group.
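
The properties in Theorem 1 can be checked numerically in a finite-dimensional model. The sketch below is only an illustration and not part of the original text: it takes the normalized discrete Fourier matrix (an assumed example of a unitary operator on a four-dimensional space) and uses hypothetical helper names such as `dft_matrix` and `is_identity`. It verifies (130), the invariance of the norm, and the invariance of the scalar product (131).

```python
import cmath
import math

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def adjoint(a):
    """Conjugate transpose, playing the role of the conjugate operator U*."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def apply(a, x):
    """Apply the matrix a to the vector x."""
    return [sum(a[i][k] * x[k] for k in range(len(x))) for i in range(len(a))]

def inner(x, y):
    """Scalar product (x, y), linear in the first argument."""
    return sum(xi * yi.conjugate() for xi, yi in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x).real)

def dft_matrix(n):
    """Normalized discrete Fourier matrix (illustrative unitary operator)."""
    w = cmath.exp(-2j * cmath.pi / n)
    return [[w ** (j * k) / math.sqrt(n) for k in range(n)] for j in range(n)]

def is_identity(a, tol=1e-12):
    n = len(a)
    return all(abs(a[i][j] - (1 if i == j else 0)) < tol
               for i in range(n) for j in range(n))

U = dft_matrix(4)
x = [1 + 2j, -1j, 0.5, 3]
y = [2, 1 - 1j, 1j, -1]

print(is_identity(matmul(U, adjoint(U))), is_identity(matmul(adjoint(U), U)))
print(abs(norm(apply(U, x)) - norm(x)))                        # close to zero
print(abs(inner(apply(U, x), apply(U, y)) - inner(x, y)))      # close to zero
```

In finite dimensions either half of (130) implies the other; the distinction between the two halves only matters in an infinite-dimensional space H.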

Let

$$x_1, x_2, x_3, \dots \qquad (132)$$


be a closed orthonormal system. On applying the unitary transformation
U to it, we obtain, in view of the properties of U, the
orthonormal system


$$y_1 = Ux_1;\quad y_2 = Ux_2;\quad y_3 = Ux_3;\ \dots \qquad (133)$$

An element x has the expansion in elements of system (132):

$$x = a_1x_1 + a_2x_2 + a_3x_3 + \dots, \qquad (134)$$


so that the transformed element Ux has an expansion in elements
of system (133) with the same coefficients:


$$Ux = a_1y_1 + a_2y_2 + a_3y_3 + \dots \qquad (135)$$


The element Ux may be any element of H, so that system (133)
is also closed. Conversely, if, given two closed orthonormal systems
$x_k$ and $y_k$ $(k = 1, 2, \dots)$, we define an operator U for any element x
having form (134) by (135), this operator transforms H one-to-one
into H without changing the norm:


$$\|Ux\|^2 = \|x\|^2 = \sum_{k=1}^{\infty} |a_k|^2,$$


i.e. U is unitary. Thus, every unitary operator can be defined with
the aid of a transformation of elements of one closed orthonormal
system into the elements of another such system.

Let A be a linear operator and $y = Ax$. Let U be a unitary operator,
and $y' = Uy$, $x' = Ux$. Since $y = Ax$, we can express $y'$ in terms
of $x'$ in accordance with


$$y' = (UAU^{-1})x', \qquad (136)$$


the operator $B = UAU^{-1}$ being called the unitary equivalent of A.
It follows from this formula that $A = U^{-1}BU$, whence it is clear
that, if B is the unitary equivalent of A, A is the unitary equivalent
of B. If P is the projector into subspace $L_P$, $UPU^{-1}$ is evidently
the projector into the subspace obtained by applying U to subspace
$L_P$. If $x_0$ is an eigenelement of A corresponding to the eigenvalue
$\lambda_0$, i.e. $Ax_0 = \lambda_0 x_0$, we obviously have, on writing $x_0' = Ux_0$:
$(UAU^{-1})x_0' = \lambda_0 x_0'$, i.e. unitary equivalents have the same eigenvalues,
whilst the eigenelements are connected by the unitary transformation
concerned. It can easily be shown, by using (129), that if
A is a self-conjugate operator, B is also self-conjugate.

THEOREM 2. The eigenvalues of a unitary operator have unit modulus,
whilst the eigenelements corresponding to different values are mutually
orthogonal.




Let U be a unitary operator and $x_0$ an eigenelement of it, corresponding
to the eigenvalue $\lambda_0$, i.e. $Ux_0 = \lambda_0 x_0$. Since U does not change
the norm, we can write


$$(x_0, x_0) = (Ux_0, Ux_0) = (\lambda_0 x_0, \lambda_0 x_0) = |\lambda_0|^2 (x_0, x_0),$$


i.e. $\|x_0\| = |\lambda_0|\,\|x_0\|$, whence it follows, since $\|x_0\| \neq 0$, that
$|\lambda_0| = 1$. Let $x_0$ and $x_1$ be eigenelements corresponding to distinct
eigenvalues $\lambda_0$ and $\lambda_1$, i.e. $Ux_0 = \lambda_0 x_0$ and $Ux_1 = \lambda_1 x_1$. Since U
does not change the scalar product, we can write


$$(x_0, x_1) = (Ux_0, Ux_1) = (\lambda_0 x_0, \lambda_1 x_1) = \lambda_0 \bar{\lambda}_1 (x_0, x_1).$$


Suppose $(x_0, x_1) \neq 0$; then it follows from this last equation that
$\lambda_0 \bar{\lambda}_1 = 1$. But, by what has been proved, $|\lambda_0| = 1$, so that
$\bar{\lambda}_1 = 1/\lambda_0 = \bar{\lambda}_0$, i.e. $\lambda_1 = \lambda_0$, which is absurd, since $\lambda_0$ and $\lambda_1$ are distinct
by hypothesis. Let us now introduce a further class of operators.

DEFINITION. A linear operator V is said to be isometric, if it does
not change the norm of an element, i.e. $\|Vx\| = \|x\|$ for $x \in H$.

Like every linear operator, V is defined in the whole of H, but
it is not required that V transform H into the whole of H; in fact
an isometric operator need not be unitary. To give an example, let
$x_k$ $(k = 1, 2, \dots)$ be a closed orthonormal system in H as above,
so that every element x is expressible by its Fourier series (134).
We define V by the formula


$$Vx = \sum_{k=1}^{\infty} a_k x_{k+1}. \qquad (137)$$

Obviously, V is a linear operator, and

$$\|Vx\|^2 = \|x\|^2 = \sum_{k=1}^{\infty} |a_k|^2.$$


It follows from (137) that V maps H one-to-one onto the subspace
of elements orthogonal to $x_1$.
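
The shift operator (137) can be sketched on elements with finitely many non-zero Fourier coefficients. This is an illustration under stated assumptions, not the author's construction: an element is represented by the list of its coefficients $a_1, a_2, \dots$ (all later ones being zero), and the names `shift` and `shift_adjoint` are hypothetical. The sketch shows that V preserves the norm and that V*V acts as the identity while VV* does not, which is why V is isometric without being unitary.

```python
import math

def shift(a):
    """V from (137): the coefficient of x_k becomes the coefficient of x_{k+1}."""
    return [0.0] + list(a)

def shift_adjoint(a):
    """V*: drops the first coefficient and moves the others down by one place."""
    return list(a[1:])

def norm(a):
    return math.sqrt(sum(abs(c) ** 2 for c in a))

a = [3.0, -4.0, 12.0]        # coefficients a_1, a_2, a_3; the rest are zero
va = shift(a)

print(norm(a), norm(va))                 # the two norms coincide
print(va[0])                             # coefficient of x_1 in Vx is always zero
print(shift_adjoint(shift(a)) == a)      # V*V x = x
print(shift(shift_adjoint(a)) == a)      # VV* x differs from x when a_1 is non-zero
```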


138. The absolute norm of an operator. We now introduce a new concept
in regard to the norm of a linear operator. Let A be a linear operator, and
$x_k$, $y_k$ $(k = 1, 2, \dots)$ any two given closed orthonormal systems in H. We form
the sum of non-negative terms:


$$\sum_{p,q=1}^{\infty} (Ax_p, y_q)\,(y_q, Ax_p) = \sum_{p,q=1}^{\infty} |(Ax_p, y_q)|^2. \qquad (138)$$


Let $N(A; x_p, y_q)$ denote the positive value of the square root of this sum.
It may be equal to $(+\infty)$. Let us show that it is independent of the choice of
orthonormal systems $x_p$ and $y_q$. Since $(Ax_p, y_q)$ are the Fourier coefficients of
the element $Ax_p$ with respect to system $y_q$, we can write instead of (138), by
virtue of the closure equation:


$$N^2(A; x_p, y_q) = \sum_{p=1}^{\infty} \|Ax_p\|^2. \qquad (139)$$

On the other hand, since $(Ax_p, y_q) = (x_p, A^*y_q)$, we obtain

$$N^2(A; x_p, y_q) = N^2(A^*; y_q, x_p) = \sum_{q=1}^{\infty} \|A^*y_q\|^2. \qquad (140)$$


It follows from (139) that $N^2(A; x_p, y_q)$ does not depend on the choice of
system $y_q$, and it follows from (140) that it does not depend on the choice of
system $x_p$; thus it is natural for us to write $N^2(A)$ instead of $N^2(A; x_p, y_q)$.
The positive number $N(A)$ will be termed the absolute norm of the operator
A. This norm may be equal to $(+\infty)$. On taking (139) and (140) into account,
and the independence of $N(A)$ of the choice of systems $x_p$ and $y_q$, we get

$$N(A) = N(A^*). \qquad (141)$$


Further, the equation

$$N^2(A + B) = \sum_{p=1}^{\infty} \|Ax_p + Bx_p\|^2$$


and inequality (107) of [59] yield at once:


$$N(A + B) \leq N(A) + N(B). \qquad (142)$$

Let U be a unitary operator. Now, $U^{-1}x_p$ forms a closed orthonormal
system and $\|UAx\| = \|Ax\|$. Hence it follows, by (139), that

$$N(UAU^{-1}) = N(A), \qquad (143)$$


i.e. unitary equivalents have the same absolute norm. Let $N(A)$ be finite and
let x be a given normalized element. We can take it as the first element in an
orthonormal system, in which case we have from (139): $N^2(A) \geq \|Ax\|^2$, i.e.


$$\|Ax\| \leq N(A) \quad\text{for}\quad \|x\| = 1,$$


whence it follows that the ordinary norm of an operator does not exceed its
absolute norm.
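
A small finite-dimensional check may make the independence of the sum (138) on the two orthonormal systems more concrete. The sketch below is illustrative only: it assumes a real 2 x 2 matrix in place of the operator A, uses rotated bases as the systems $x_p$ and $y_q$, and the helper names (`abs_norm_sq`, `rotated_basis`) are hypothetical.

```python
import math

def apply(a, x):
    """Apply the matrix a to the vector x."""
    return [sum(a[i][k] * x[k] for k in range(len(x))) for i in range(len(a))]

def inner(x, y):
    """Scalar product of real vectors."""
    return sum(xi * yi for xi, yi in zip(x, y))

def abs_norm_sq(a, xs, ys):
    """Sum over p, q of |(A x_p, y_q)|^2, as in (138)."""
    return sum(inner(apply(a, xp), yq) ** 2 for xp in xs for yq in ys)

def rotated_basis(theta):
    """Orthonormal basis of the plane obtained by rotating the standard one."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]

A = [[1.0, 2.0], [3.0, 4.0]]
standard = [[1.0, 0.0], [0.0, 1.0]]

n1 = abs_norm_sq(A, standard, standard)                     # 1 + 4 + 9 + 16
n2 = abs_norm_sq(A, rotated_basis(0.7), rotated_basis(-1.3))

x = rotated_basis(0.3)[0]                                   # a normalized element
ax = apply(A, x)
print(n1, n2)                                               # both equal 30 up to rounding
print(math.sqrt(inner(ax, ax)) <= math.sqrt(n1))            # ||Ax|| <= N(A)
```

For matrices the quantity N(A) computed here is the square root of the sum of the squared moduli of the entries.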

THEOREM. If the absolute norm of an operator A is finite, A is completely
continuous; if, in addition, A is self-conjugate, we have


$$N^2(A) = \sum_{k} \lambda_k^2, \qquad (144)$$


where $\lambda_k$ are the eigenvalues of A (multiple values appear several times).




Let U be any given bounded set. We have to show that, if $N(A) < +\infty$,
the set of elements Ax, where $x \in U$, is compact. By hypothesis, there exists a positive
number l such that $\|x\| \leq l$ if $x \in U$. The set AU is evidently bounded, since
$\|Ax\| \leq n_A l$. On introducing some closed orthonormal system $y_k$ $(k = 1, 2, \dots)$,
we can transform H into $l_2$, and it remains for us to show that, given any $\varepsilon > 0$,
there exists a positive integer $n_0$ such that


$$\sum_{k=n_0}^{\infty} |(Ax, y_k)|^2 \leq \varepsilon^2. \qquad (145)$$


We have


$$\sum_{k=n_0}^{\infty} |(Ax, y_k)|^2 = \sum_{k=n_0}^{\infty} |(x, A^*y_k)|^2 \leq l^2 \sum_{k=n_0}^{\infty} \|A^*y_k\|^2.$$

But, since $N(A) < +\infty$, series (140) is convergent and there exists an $n_0$
(independent of the choice of $x \in U$) such that


$$\sum_{k=n_0}^{\infty} \|A^*y_k\|^2 \leq \frac{\varepsilon^2}{l^2},$$

whence (145) follows, and the complete continuity of A is proved. If A is
self-conjugate, we choose for $y_k$ the closed orthonormal system of its eigenelements,
so that $Ay_k = \lambda_k y_k$ and $\|Ay_k\|^2 = \|A^*y_k\|^2 = \lambda_k^2$. Now, by (140), we obtain
(144), and the fact that $N(A)$ is finite is equivalent to the convergence of the
series on the right-hand side of (144).

We shall prove later that, if A is a self-conjugate positive operator, i.e.
$(Ax, x) \geq 0$ for $x \in H$, there exists a linear positive operator B such that
$B^2 = A$. We usually write $B = \sqrt{A}$. We can use this operator to introduce
the concept of the trace of a linear positive completely continuous self-conjugate
operator:


$$N^2(B) = \sum_{p=1}^{\infty} \|Bx_p\|^2 = \sum_{p=1}^{\infty} (Bx_p, Bx_p) = \sum_{p=1}^{\infty} (B^2x_p, x_p) = \sum_{p=1}^{\infty} (Ax_p, x_p).$$


It will be seen from this that, in the case of a self-conjugate positive operator,
the sum


$$\sum_{p=1}^{\infty} (Ax_p, x_p)$$


is independent of the choice of system $x_p$. This sum is called the trace of operator
A and is written symbolically as Sp(A). It follows from the above discussion
that

$$\mathrm{Sp}(A) = N^2(\sqrt{A}) = \sum_{p=1}^{\infty} (Ax_p, x_p).$$


If A has a purely point spectrum and we take as $x_p$ the closed orthonormal
system of the eigenelements of A, we obtain, since $Ax_p = \mu_p x_p$:


$$\mathrm{Sp}(A) = \sum_{p=1}^{\infty} \mu_p.$$




139. Operations on subspaces. This section and the next will be 
devoted to operations on subspaces and the properties of projection 
operators. This material will be required later for the theory of self- 
conjugate operators. 


Let $L_k$ $(k = 1, 2, \dots, m)$ be mutually orthogonal subspaces. We
bring in the concept of their sum [cf. 122]:

$$L = L_1 \oplus L_2 \oplus \dots \oplus L_m. \qquad (146)$$

We shall write L for the set of elements x of the form

$$x = x_1 + x_2 + \dots + x_m, \qquad (147)$$


where $x_k \in L_k$. Since the $L_k$ are orthogonal, we have $x_k = P_{L_k}x$
$(k = 1, 2, \dots, m)$, and

$$\|x\|^2 = \|x_1\|^2 + \|x_2\|^2 + \dots + \|x_m\|^2.$$


It may easily be shown that L is a subspace. It is known as the
orthogonal sum of subspaces $L_k$. Let us now consider an infinite sum
of mutually orthogonal subspaces:


$$L = L_1 \oplus L_2 \oplus L_3 \oplus \dots \qquad (148)$$


Let L denote the set of elements x expressible as the sum of a
convergent series


$$x = x_1 + x_2 + x_3 + \dots, \qquad (149)$$

where $x_k \in L_k$. The last equation is equivalent to [122]:

$$\|x\|^2 = \|x_1\|^2 + \|x_2\|^2 + \|x_3\|^2 + \dots, \qquad (150)$$


and when it is satisfied, $x_k = P_{L_k}x$. If x is any element of H,
$x_k = P_{L_k}x$ and $s_m(x) = x_1 + x_2 + \dots + x_m$, then


$$\|x - s_m(x)\|^2 = \|x\|^2 - \sum_{k=1}^{m} \|x_k\|^2. \qquad (151)$$


The equation (150) is equivalent to $\|x - s_m(x)\| \to 0$ as $m \to \infty$.
It may easily be seen that L is a lineal. Let us show that L is a subspace.
Let $x^{(n)} \in L$ and $x^{(n)} \Rightarrow x$ as $n \to \infty$. We have to show that
$x \in L$ also. We have the obvious inequality


$$\|x - s_m(x)\| \leq \|x - x^{(n)}\| + \|x^{(n)} - s_m(x^{(n)})\| + \|s_m(x^{(n)} - x)\|.$$

But $s_m(x^{(n)} - x)$ is the projection of $x^{(n)} - x$ on to the subspace
$L_1 \oplus L_2 \oplus \dots \oplus L_m$, so that $\|s_m(x^{(n)} - x)\| \leq \|x^{(n)} - x\|$, and we
can write

$$\|x - s_m(x)\| \leq 2\,\|x - x^{(n)}\| + \|x^{(n)} - s_m(x^{(n)})\|. \qquad (152)$$




Given $\varepsilon > 0$, we can fix an n such that $\|x - x^{(n)}\| \leq \varepsilon/3$. But
$\|x^{(n)} - s_m(x^{(n)})\| \leq \varepsilon/3$ for all sufficiently large m, since $x^{(n)} \in L$,
and it follows from (152) that $\|x - s_m(x)\| \leq \varepsilon$, i.e. $x \in L$, and we
have proved that L is a subspace. Given any $y \in H$, the element $P_L y$
is expressible as


$$P_L y = z_1 + z_2 + \dots, \qquad (153)$$


where $z_k = P_{L_k}(P_L y)$. But, since the $L_k$ belong to L, we have
$P_L(P_{L_k}y) = P_{L_k}y$, i.e. $P_L P_{L_k} = P_{L_k}$, and on passing to conjugate operators,
$P_{L_k} P_L = P_{L_k}$, whence $z_k = P_{L_k}y$, and (153) can be rewritten as


$$P_L y = P_{L_1}y + P_{L_2}y + P_{L_3}y + \dots, \qquad (154)$$

i.e.


$$P_L = P_{L_1} + P_{L_2} + P_{L_3} + \dots, \qquad (155)$$


where the convergence of the series must be understood in the sense
of the strong convergence of a sequence of operators.

Notice that, if $x_1, x_2, \dots$ are mutually orthogonal and normalized
elements, and it is assumed that each $x_k$ generates a one-dimensional
subspace $L_k$ of elements $ax_k$, where a is any complex number, we
have an orthogonal sum of these subspaces, which is formed by
elements of the form

$$\sum_{k} c_k x_k,$$


where the series of numbers $|c_k|^2$ is convergent, and the projector
into subspace L has the form


$$P_L y = \sum_{k} a_k x_k,$$


where $a_k x_k = P_{L_k}y$ and $a_k = (y, x_k)$.

A subspace M is said to be part of subspace L ($M \subset L$) if all the
elements of M belong to L. The difference between L and M, $L \ominus M$,
is defined as the set of elements of L orthogonal to M [122]. If we
write $L \ominus M = M_1$, then $L = M \oplus M_1$, and subspaces M and $M_1$
are complementary to each other with respect to L [122].

The product $L_1 L_2$ of two subspaces is the set of elements common
to $L_1$ and $L_2$. It is easily shown that this set is a subspace. This
definition of product is applicable to any finite or infinite number of
subspaces.




140. Projection operators. We have seen that the projection
operator $P_L$ into a subspace L is self-conjugate and has unit norm
(excluding the case when $P_L$ is the annihilation operator [124]).
It follows at once from the definition that


$$P_L^2 = P_L, \qquad (156)$$

so that

$$(P_L x, x) = (P_L^2 x, x) = (P_L x, P_L x) = \|P_L x\|^2 \geq 0,$$

i.e. $P_L$ is a positive operator. Let us prove some theorems on
projectors.
THEOREM 1. If A is a self-conjugate operator, satisfying


$$A^2 = A, \qquad (157)$$


A is the projector $P_L$ into the subspace L formed by the elements $y = Ax$
when x runs over all H.

The set L of elements $y = Ax$, when x runs over H, is a lineal,
since the operator A is distributive. Let us show that L is a subspace.
Let $y_n$ be a sequence of elements of L and $y_n \Rightarrow y$. We have to show
that $y \in L$. Since $y_n \in L$, we can say that there exist elements $x_n$
such that $y_n = Ax_n$, or, by (157), $y_n = A(Ax_n)$, i.e. $y_n = Ay_n$, whence,
by passing to the limit and using the continuity of operator A, we
get $y = Ay$, so that $y \in L$. It remains for us to show, in order to
complete the proof, that the element $(x - Ax)$ is orthogonal to any
element of L, i.e. is orthogonal to an element Az, where z is any
element of H. We have


$$(x - Ax, Az) = (x, Az) - (Ax, Az).$$


Since A is self-conjugate, we can change A over from the first
element x to the second element z. We thus get


$$(x - Ax, Az) = (x, Az) - (x, A^2 z)$$


and it follows from (157) that the right-hand side is zero, i.e.
$(x - Ax, Az) = 0$; the theorem is proved.
Two projectors $P_L$ and $P_M$ are said to be mutually orthogonal if


$$P_L P_M = 0, \qquad (158)$$

where the symbol 0 on the right indicates the annihilation operator.


On passing to conjugate operators in (158) and recalling that the
projector is self-conjugate, we obtain, along with (158):


$$P_M P_L = 0. \qquad (159)$$


THEOREM 2. The necessary and sufficient condition for projectors $P_L$
and $P_M$ to be mutually orthogonal is that the subspaces L and M be
mutually orthogonal.

Necessity. If L and M were not mutually orthogonal, there would
exist an element $x_0$ of M, not orthogonal to L. We should have
$P_M x_0 = x_0$ for such an element, so that $P_L(P_M x_0) = P_L x_0 \neq 0$, which
contradicts (158). Let us prove the sufficiency. If $L \perp M$, $P_M x$ is
orthogonal to L for any element x, so that $P_L(P_M x) = 0$, i.e. (158)
holds.

THEOREM 3. The necessary and sufficient condition for the sum
$P_L + P_M$ to be a projector is that subspaces L and M be mutually
orthogonal. If this condition is fulfilled, $P_L + P_M$ is the projector into
$L \oplus M$.

Necessity. Let $P_L + P_M$ be a projector. We now have, by (156):


$$(P_L + P_M)(P_L + P_M) = P_L + P_M \qquad (160)$$


and, on removing the brackets and recalling that $P_L^2 = P_L$ and
$P_M^2 = P_M$, we get


$$P_L P_M + P_M P_L = 0. \qquad (161)$$

We multiply on the left by $P_L$:

$$P_L P_M + P_L P_M P_L = 0. \qquad (162)$$


On multiplying this equation on the right by $P_L$, we get $P_L P_M P_L = 0$,
which leads us, by (162), to $P_L P_M = 0$, from which it follows,
by Theorem 2, that L and M are mutually orthogonal. Let us prove
the sufficiency. If L and M are mutually orthogonal, we have (161),
by virtue of (158) and (159), and hence $(P_L + P_M)^2 = P_L + P_M$,
and by Theorem 1, $P_L + P_M$ is a projector. The subspace corresponding
to this projector is defined by


$$y = (P_L + P_M)x = P_L x + P_M x, \qquad (163)$$


where x runs over H. Here, $P_L x \in L$ and $P_M x \in M$. Hence any
element y, defined by (163), belongs to $L \oplus M$. Conversely, if we
take any element $u + v$ belonging to $L \oplus M$, where $u \in L$ and $v \in M$,
substitution of $x = u + v$ in (163) gives us $y = u + v$. Thus (163)
defines the subspace $L \oplus M$, and the theorem is proved.
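
Theorems 2 and 3 can be illustrated in three-dimensional real space. The sketch below is an assumption-laden example rather than anything taken from the text: the subspaces L, M and N are lines spanned by chosen unit vectors, and the helper names (`projector`, `matmul`, `matadd`, `close`) are hypothetical. L and M are orthogonal, so $P_L P_M = 0$ and $P_L + P_M$ satisfies (157); L and N are not orthogonal, and $P_L + P_N$ fails (157).

```python
import math

def projector(u):
    """Matrix of the projector onto the line spanned by the unit vector u."""
    return [[ui * uj for uj in u] for ui in u]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def close(a, b, tol=1e-12):
    """Entrywise comparison of two matrices."""
    return all(abs(p - q) < tol for ra, rb in zip(a, b) for p, q in zip(ra, rb))

zero = [[0.0] * 3 for _ in range(3)]
P_L = projector([1.0, 0.0, 0.0])
P_M = projector([0.0, 1.0, 0.0])          # M is orthogonal to L
r = 1 / math.sqrt(2)
P_N = projector([r, r, 0.0])              # N is not orthogonal to L

S = matadd(P_L, P_M)
T = matadd(P_L, P_N)

print(close(matmul(P_L, P_M), zero))      # condition (158) holds
print(close(matmul(S, S), S))             # P_L + P_M is a projector
print(close(matmul(T, T), T))             # P_L + P_N is not
```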

The operator $P_M$ is said to be part of operator $P_L$ if


$$P_L P_M = P_M. \qquad (164)$$


On passing to conjugate operators in this expression, we get


$$P_M P_L = P_M. \qquad (165)$$


THEOREM 4. The necessary and sufficient condition for $P_M$ to be
part of $P_L$ is that the subspace M be part of subspace L. This is equivalent
to the condition that, for any x,


$$\|P_M x\| \leq \|P_L x\|, \qquad (166)$$

or what amounts to the same thing,

$$P_M \leq P_L. \qquad (167)$$


If condition (164) is satisfied and we take an element $x_0$ belonging
to M, we have $P_M x_0 = x_0$ and it follows from (164) that $P_L x_0 = x_0$,
i.e. $x_0 \in L$ and M is part of L. Conversely, if M is part of L, given
any x, the element $P_M x$ belongs to M, and hence to L, so that
$P_L(P_M x) = P_M x$, i.e. condition (164) is fulfilled. Now, by (165), we
can write for any element x: $\|P_M x\| = \|P_M(P_L x)\| \leq \|P_L x\|$,
whence inequality (166) follows. Let us now show that, conversely,
(166) implies that M is part of L. If this were not the case, an element
$x_0$ would exist, belonging to M and not to L. We should have for
this element: $\|P_M x_0\| = \|x_0\|$ and $\|P_L x_0\| < \|x_0\|$, which
contradicts (166). Finally, by (157), we can write (166) as
$(P_L x, x) \geq (P_M x, x)$ or $((P_L - P_M)x, x) \geq 0$, whence it follows that (167) is
equivalent to (166), and the proof is complete.

THEOREM 5. The necessary and sufficient condition for the difference
$P_L - P_M$ to be a projector is that M be part of L. If this condition is
fulfilled, $P_L - P_M$ is the projector into $L \ominus M$.

If $P_L - P_M$ is a projector, we must have


$$(P_L - P_M)(P_L - P_M) = P_L - P_M \qquad (168)$$

or, on removing the brackets,

$$P_L P_M + P_M P_L = 2P_M. \qquad (169)$$


On multiplying by $P_L$, first from the left, then from the right,
we arrive at the two equations


$$P_L P_M + P_L P_M P_L = 2P_L P_M \quad\text{and}\quad P_L P_M P_L + P_M P_L = 2P_M P_L,$$


from which it follows that $P_L P_M = P_M P_L$, and by (169), we have
$P_L P_M = P_M P_L = P_M$, i.e. condition (164) is fulfilled, and M is
part of L. Conversely, if M is part of L, i.e. (164) and (165) are
satisfied, (169) follows from them, and hence (168), so that, by Theorem 1,
$P_L - P_M$ is a projector. The subspace corresponding to it is defined by


$$y = (P_L - P_M)x = P_L x - P_M x, \qquad (170)$$


where x runs over all H. The elements $P_L x$ and $P_M x$ belong to L,
since M is part of L by hypothesis. Formula (170) thus yields elements
belonging to L. Let us show that, in addition, the elements y are
orthogonal to M. Let z be any element of M. We have $P_M z = z$,
and we can write


$$(P_L x - P_M x, z) = (P_L x - P_M x, P_M z).$$


On transferring $P_M$ from the right to the left and using condition
(165), we obtain


$$(P_L x - P_M x, z) = (P_M x - P_M x, z) = 0,$$


i.e. in fact $P_L x - P_M x \perp M$. Hence (170) yields elements y belonging
to $L \ominus M$. If u is any element of $L \ominus M$, i.e. $u \in L$ and $u \perp M$,
then $y = P_L u - P_M u = P_L u = u$, and we can therefore finally
assert that (170) defines the subspace $L \ominus M$, and the theorem is
proved.

THEOREM 6. The necessary and sufficient condition for the product
$P_L P_M$ to be a projector is that $P_L$ and $P_M$ commute, i.e.


$$P_L P_M = P_M P_L. \qquad (171)$$


If this condition is fulfilled, $P_L P_M$ is a projector onto the subspace
LM.

The necessity of (171) follows from the fact that (171) is necessary
and sufficient for $P_L P_M$ to be self-conjugate. Let us now show that,
given (171), the operator $P_L P_M$ satisfies (157):


$$(P_L P_M)(P_L P_M) = P_L^2 P_M^2 = P_L P_M.$$


The first part of the theorem is therefore proved. If x is any element
of H, the element


$$y = (P_L P_M)x = P_L(P_M x) = P_M(P_L x) \qquad (172)$$


obviously belongs both to L and M, i.e. belongs to LM. Conversely,
if we take any element $x_0$ of LM, (172) with $x = x_0$ gives us $y = x_0$.
Thus (172) defines the subspace LM, and the proof is complete.

THEOREM 7. The limit of a convergent sequence of projectors is a
projector.

We have $P_n \to P$, the $P_n$ being projectors and P a self-conjugate
operator [131]. On passing to the limit in $P_n^2 = P_n$, we get $P^2 = P$,
from which it follows, by Theorem 1, that P is a projector.

THEOREM 8. Every monotonic sequence of projectors has a limit.

We first consider a non-decreasing sequence of projectors:


$$P_1 \leq P_2 \leq P_3 \leq \dots \qquad (173)$$


To prove that sequence (173) has a limit, we have to show that
$P_n x$ has a limit for any choice of x, i.e. given any positive $\varepsilon$, there
must exist an N such that


$$\|P_n x - P_m x\| \leq \varepsilon \quad\text{for}\quad n > m > N. \qquad (174)$$

By (173) and Theorem 4,

$$\|P_1 x\| \leq \|P_2 x\| \leq \|P_3 x\| \leq \dots,$$

where we have for any n: $\|P_n x\| \leq \|x\|$. The non-decreasing
sequence of non-negative numbers $\|P_n x\|$ therefore has a limit, and,
given any $\varepsilon > 0$, there exists an N such that


$$\|P_n x\|^2 - \|P_m x\|^2 \leq \varepsilon^2 \quad\text{for}\quad n > m > N.$$

By (157), we can write this inequality as

$$((P_n - P_m)x, x) \leq \varepsilon^2 \quad\text{for}\quad n > m > N.$$


Since $P_m$ is part of $P_n$, $P_n - P_m$ is a projector, and by (157), the
last inequality leads to (174); the theorem is proved. Notice that,
by Theorem 7, the limiting operator P of sequence (173) is a projector,
and on passing to the limit as $n \to \infty$ in the inequality
$((P_n - P_m)x, x) \geq 0$, we get $((P - P_m)x, x) \geq 0$, i.e. $P \geq P_m$. It can be
shown similarly that a decreasing sequence of projectors has a limit,
and this limit is also a projector.

THEOREM 9. If $L_k$ $(k = 1, 2, \dots)$ is a denumerable number of mutually
orthogonal subspaces, the sum


$$\sum_{k} P_{L_k} \qquad (175)$$

is a projector into the subspace

$$L = L_1 \oplus L_2 \oplus \dots \qquad (176)$$


This proposition is a direct consequence of what was said in [139].




141. The resolution of the identity. The Stieltjes integral. The sub- 
sequent development of the theory of self-conjugate operators is based 
on a general expression for any self-conjugate operator. Before deducing 
this expression, an important new concept must be introduced. 

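DEFINITION. A resolution of the identity is defined as a family of
projectors $\mathcal{E}_\lambda$ depending on a real parameter $\lambda$ and satisfying the following
conditions: (1) a projector $\mathcal{E}_\lambda$ does not decrease as $\lambda$ increases, i.e. if
$\mu > \lambda$, then $\mathcal{E}_\mu \geq \mathcal{E}_\lambda$; (2) there exist finite values $\lambda = a$ and $\lambda = b$
such that $\mathcal{E}_a = 0$ and $\mathcal{E}_b = E$; (3) the projector $\mathcal{E}_\lambda$ is continuous from
the right with respect to the parameter $\lambda$, i.e.

$$\lim_{\mu \to \lambda + 0} \mathcal{E}_\mu = \mathcal{E}_\lambda. \qquad (177)$$

Notice that, by Theorem 7 of [140], given any value $\lambda'$, $\mathcal{E}_\lambda$ has
a limit as $\lambda$ tends to $\lambda'$ both from the left and the right. These limits
are projectors, which are naturally denoted by $\mathcal{E}_{\lambda'-0}$ and $\mathcal{E}_{\lambda'+0}$.
By (177), we must have $\mathcal{E}_{\lambda+0} = \mathcal{E}_\lambda$. We shall say that $\mathcal{E}_\lambda$ is
continuous at the point $\lambda$ if $\mathcal{E}_\lambda = \mathcal{E}_{\lambda-0}$. Condition (177) requires that
the projector $\mathcal{E}_\lambda$ be continuous from the right at every point. This
condition is only added so as to fix the value of $\mathcal{E}_\lambda$ at every point of
discontinuity with respect to $\lambda$.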

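Let us note some properties of the resolution of the identity $\mathcal{E}_\lambda$.
Let m be the strict upper bound of all the $\lambda$ for which $\mathcal{E}_\lambda = 0$, i.e.


$$\mathcal{E}_\lambda = 0 \ \text{ for } \ \lambda < m \quad\text{and}\quad \mathcal{E}_\lambda > 0 \ \text{ for } \ \lambda > m. \qquad (178)$$


At the point $\lambda = m$ itself, the projector $\mathcal{E}_\lambda$ will differ from the
zero operator if $\mathcal{E}_\lambda$ has a jump at this point. Let M be the strict lower
bound of the $\lambda$ for which $\mathcal{E}_\lambda = E$. Since $\mathcal{E}_\lambda$ is continuous from the
right, we must have $\mathcal{E}_M = E$; thus M is defined by the following
conditions:

$$\mathcal{E}_\lambda < E \ \text{ for } \ \lambda < M \quad\text{and}\quad \mathcal{E}_\lambda = E \ \text{ for } \ \lambda \geq M. \qquad (179)$$


If $\varepsilon_0$ is any fixed positive number, we can say that the projector
$\mathcal{E}_\lambda$ varies from 0 to E as $\lambda$ varies in the interval $[m - \varepsilon_0, M]$. Further,
by Theorem 5, we can say that, when $\mu > \lambda$, $\mathcal{E}_\mu - \mathcal{E}_\lambda$ is a projector,
and

$$\mathcal{E}_\mu \mathcal{E}_\lambda = \mathcal{E}_\lambda \mathcal{E}_\mu = \mathcal{E}_\lambda \quad (\mu > \lambda). \qquad (180)$$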
If we let $\lambda$ tend to $\mu$ from the left in the difference $\mathcal{E}_\mu - \mathcal{E}_\lambda$, it
will be seen that $\mathcal{E}_\mu - \mathcal{E}_{\mu-0}$ is a projector. It can similarly be shown
that $\mathcal{E}_\nu - \mathcal{E}_{\mu-0}$ is a projector when $\nu > \mu$. The following notation
will often be used in future. Let $\Delta$ be an interval $[\alpha, \beta]$. We write

$$\Delta\mathcal{E}_\lambda = \mathcal{E}_\beta - \mathcal{E}_\alpha. \qquad (181)$$

If $\Delta'$ and $\Delta''$ are two intervals having no common interior points,
we have by (180):

$$\Delta'\mathcal{E}_\lambda \cdot \Delta''\mathcal{E}_\lambda = 0 \qquad (182)$$

($\Delta'$ and $\Delta''$ have no common interior points).


Using Theorem 2 of [140], we can say that the last equation is
equivalent to the following: given any elements x and y, we have


$$\Delta'\mathcal{E}_\lambda x \perp \Delta''\mathcal{E}_\lambda y \quad (x \text{ and } y \text{ arbitrary}) \qquad (183)$$

($\Delta'$ and $\Delta''$ have no common interior points).
If $\Delta_0$ is the common part of intervals $\Delta'$ and $\Delta''$, we have by (180):

$$\Delta'\mathcal{E}_\lambda \cdot \Delta''\mathcal{E}_\lambda = \Delta_0\mathcal{E}_\lambda. \qquad (184)$$

We know how to add operators and pass to the limit in an operator
sequence. This gives us the possibility of using the resolution of the
identity $\mathcal{E}_\lambda$ to form a "Stieltjes integral" for any continuous function.
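Let $f(\lambda)$ be a given continuous function, which may be complex,
in an interval $[m - \varepsilon_0, M]$, where $\varepsilon_0$ is a fixed positive number.
We subdivide the interval:


$$m - \varepsilon_0 = \lambda_0 < \lambda_1 < \lambda_2 < \dots < \lambda_{n-1} < \lambda_n = M, \qquad (185)$$


and form the "Riemann-Stieltjes sum" corresponding to this
subdivision $\delta$ of the interval:


$$\sigma_\delta = \sum_{k=1}^{n} f(\nu_k)\,\Delta_k\mathcal{E}_\lambda = \sum_{k=1}^{n} f(\nu_k)\,(\mathcal{E}_{\lambda_k} - \mathcal{E}_{\lambda_{k-1}}), \qquad (186)$$


where $\nu_k$ is any value from the interval $[\lambda_{k-1}, \lambda_k]$. The sum $\sigma_\delta$ is a
linear operator. Let $\eta_\delta$ denote the greatest of the differences
$\lambda_k - \lambda_{k-1}$. The following fundamental theorem holds:

THEOREM. Given any sequence of subdivisions $\delta_n$ with the condition
that $\eta_{\delta_n} \to 0$, the sequence of operators $\sigma_{\delta_n}$ has a definite limit, in the
sense of strong convergence of the operators.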
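
The construction of the sum (186) can be tried out in a finite-dimensional model. The following sketch is an illustration under explicit assumptions, not the general case treated in the text: the operator is taken to be diagonal with the listed eigenvalues, $\mathcal{E}_\lambda$ is the projector onto the coordinates whose eigenvalue does not exceed $\lambda$ (which is continuous from the right), the subdivision is into equal parts, and the function names are hypothetical. In this model the limit of the sums applied to x has coordinates $f(\mu_i)\,x_i$, where $\mu_i$ are the eigenvalues.

```python
import math

def E(lam, eigenvalues, x):
    """E_lam x: keep the coordinates whose eigenvalue is <= lam."""
    return [xi if mu <= lam else 0.0 for mu, xi in zip(eigenvalues, x)]

def stieltjes_sum(f, eigenvalues, x, lo, hi, n):
    """Sum (186) applied to x, for n equal sub-intervals of [lo, hi]."""
    total = [0.0] * len(x)
    prev = E(lo, eigenvalues, x)
    for k in range(1, n + 1):
        left = lo + (hi - lo) * (k - 1) / n
        right = lo + (hi - lo) * k / n if k < n else hi
        cur = E(right, eigenvalues, x)
        nu = 0.5 * (left + right)          # any point of [left, right] would do
        for i in range(len(x)):
            total[i] += f(nu) * (cur[i] - prev[i])
        prev = cur
    return total

def norm(x):
    return math.sqrt(sum(c * c for c in x))

eigenvalues = [-1.0, 0.5, 0.5, 2.0]        # assumed spectrum; m = -1, M = 2
x = [1.0, -2.0, 0.5, 3.0]
f = lambda lam: lam * lam

approx = stieltjes_sum(f, eigenvalues, x, -1.5, 2.0, 1000)   # epsilon_0 = 0.5
exact = [f(mu) * xi for mu, xi in zip(eigenvalues, x)]
error = norm([p - q for p, q in zip(approx, exact)])
print(error <= 0.01 * norm(x))
```

The error bound used in the last line reflects the oscillation of $f(\lambda) = \lambda^2$ over sub-intervals of length 0.0035, in the spirit of Lemma 2 below.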

We must first prove two lemmas. 


Lemma 1. If a and $a_k$ $(k = 1, 2, \dots, n)$ are complex numbers and
$x = x_1 + x_2 + \dots + x_n$, the elements $x_k$ being mutually orthogonal,
we have

$$\Bigl\| ax - \sum_{k=1}^{n} a_k x_k \Bigr\| \leq \delta\,\|x\|, \qquad (187)$$


where $\delta$ is the greatest of the numbers $|a - a_k|$.

We can write

$$ax - \sum_{k=1}^{n} a_k x_k = \sum_{k=1}^{n} (a - a_k)\,x_k,$$


whence, by Pythagoras' theorem,


$$\Bigl\| ax - \sum_{k=1}^{n} a_k x_k \Bigr\|^2 = \sum_{k=1}^{n} |a - a_k|^2\,\|x_k\|^2 \leq \delta^2 \sum_{k=1}^{n} \|x_k\|^2. \qquad (188)$$


Pythagoras' theorem also gives us

$$\|x\|^2 = \sum_{k=1}^{n} \|x_k\|^2,$$


and (188) now leads directly to (187); the lemma is proved.
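Lemma 2. If $\delta$ is a subdivision (185) of the interval $[m - \varepsilon_0, M]$,
and $\delta'$ is some other subdivision:


$$m - \varepsilon_0 = \lambda'_0 < \lambda'_1 < \lambda'_2 < \dots < \lambda'_{n'-1} < \lambda'_{n'} = M$$

of the same interval, we have for any element x:

$$\|\sigma_\delta x - \sigma_{\delta'} x\| \leq 2\omega\,\|x\|, \qquad (189)$$


where $\omega$ is the greatest oscillation of the function $f(\lambda)$ in the intervals
$[\lambda_{k-1}, \lambda_k]$ and $[\lambda'_{k-1}, \lambda'_k]$, i.e. $\omega$ is the number such that


$$|f(\alpha) - f(\beta)| \leq \omega, \qquad (190)$$


if $\alpha$ and $\beta$ belong to the same interval $[\lambda_{k-1}, \lambda_k]$ or to the same interval
$[\lambda'_{k-1}, \lambda'_k]$.

We form the product $\delta\delta'$ of the subdivisions. On passing from
subdivision $\delta$ to subdivision $\delta\delta'$, each sub-interval $\Delta_k$ of $\delta$ is split into
a finite number of sub-intervals $\Delta_k^{(s)}$ $(s = 1, 2, \dots, m_k)$. Each term
$f(\nu_k)\,\Delta_k\mathcal{E}_\lambda x$ of the sum


$$\sigma_\delta x = \sum_{k=1}^{n} f(\nu_k)\,\Delta_k\mathcal{E}_\lambda x \qquad (191)$$


is now replaced by the sum


$$\sum_{s=1}^{m_k} f(\nu_k^{(s)})\,\Delta_k^{(s)}\mathcal{E}_\lambda x,$$


where $\nu_k^{(s)}$ is a value from the sub-interval $\Delta_k^{(s)}$. Hence $\nu_k$ and $\nu_k^{(s)}$
belong to the same sub-interval $\Delta_k$ of subdivision $\delta$, and we have,
by (190):

$$|f(\nu_k) - f(\nu_k^{(s)})| \leq \omega \quad (s = 1, 2, \dots, m_k). \qquad (192)$$


We form the difference

$$\sigma_\delta x - \sigma_{\delta\delta'} x = \sum_{k=1}^{n} \Bigl\{ f(\nu_k)\,\Delta_k\mathcal{E}_\lambda x - \sum_{s=1}^{m_k} f(\nu_k^{(s)})\,\Delta_k^{(s)}\mathcal{E}_\lambda x \Bigr\}.$$


By (183), the elements $\Delta_k\mathcal{E}_\lambda x$, $\Delta_k^{(s)}\mathcal{E}_\lambda x$ are mutually orthogonal
for different k, and we can write, using Pythagoras' theorem:


$$\|\sigma_\delta x - \sigma_{\delta\delta'} x\|^2 = \sum_{k=1}^{n} \Bigl\| f(\nu_k)\,\Delta_k\mathcal{E}_\lambda x - \sum_{s=1}^{m_k} f(\nu_k^{(s)})\,\Delta_k^{(s)}\mathcal{E}_\lambda x \Bigr\|^2. \qquad (193)$$


We have, in addition, $\Delta_k\mathcal{E}_\lambda x = \sum_{s=1}^{m_k} \Delta_k^{(s)}\mathcal{E}_\lambda x$, and the elements
$\Delta_k^{(s)}\mathcal{E}_\lambda x$ $(s = 1, 2, \dots, m_k)$ are also orthogonal to each other. Using
Lemma 1 and (192), we obtain


$$\Bigl\| f(\nu_k)\,\Delta_k\mathcal{E}_\lambda x - \sum_{s=1}^{m_k} f(\nu_k^{(s)})\,\Delta_k^{(s)}\mathcal{E}_\lambda x \Bigr\| \leq \omega\,\|\Delta_k\mathcal{E}_\lambda x\|,$$

so that, by (193),

$$\|\sigma_\delta x - \sigma_{\delta\delta'} x\|^2 \leq \omega^2 \sum_{k=1}^{n} \|\Delta_k\mathcal{E}_\lambda x\|^2. \qquad (194)$$


Since $\mathcal{E}_{m-\varepsilon_0} = 0$ and $\mathcal{E}_M = E$, we can write

$$x = \sum_{k=1}^{n} \Delta_k\mathcal{E}_\lambda x, \qquad (195)$$


where the elements on the right are mutually orthogonal. Pythagoras'
theorem gives


$$\|x\|^2 = \sum_{k=1}^{n} \|\Delta_k\mathcal{E}_\lambda x\|^2, \qquad (196)$$


and inequality (194) can be rewritten as

$$\|\sigma_\delta x - \sigma_{\delta\delta'} x\| \leq \omega\,\|x\|.$$

It can similarly be shown that

$$\|\sigma_{\delta'} x - \sigma_{\delta\delta'} x\| \leq \omega\,\|x\|,$$

and the statement of the lemma follows at once from


$$\|\sigma_\delta x - \sigma_{\delta'} x\| \leq \|\sigma_\delta x - \sigma_{\delta\delta'} x\| + \|\sigma_{\delta'} x - \sigma_{\delta\delta'} x\|.$$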




We now turn to the proof of the theorem. We have to show that,
given any element x, the sequence of elements $\sigma_{\delta_n} x$ has a limit, i.e.
$\sigma_{\delta_n} x \Rightarrow y$. Once this is proved, the limiting element y may easily
be seen to be independent of the choice of sequence $\delta_n$. For, if $\delta_n$
and $\delta'_n$ are two sequences of subdivisions, satisfying the condition
indicated in the theorem, and if $\sigma_{\delta_n} x \Rightarrow y$ and $\sigma_{\delta'_n} x \Rightarrow y'$, then the
sequence of subdivisions $\delta_1, \delta'_1, \delta_2, \delta'_2, \dots$ also satisfies the condition
of the theorem, so that the sequence of elements $\sigma_{\delta_1} x, \sigma_{\delta'_1} x, \sigma_{\delta_2} x,
\sigma_{\delta'_2} x, \dots$ must also have a limit. It follows at once from this that
$y' = y$.

Let us establish a preliminary inequality. 

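The elements $\Delta_k\mathcal{E}_\lambda x$, appearing in the sum $\sigma_\delta x$, are orthogonal to
each other, and by Pythagoras' theorem:

$$\|\sigma_\delta x\|^2 = \sum_{k=1}^{n} |f(\nu_k)|^2\,\|\Delta_k\mathcal{E}_\lambda x\|^2. \tag{197}$$

Further, the continuous function $f(\lambda)$ is bounded in modulus, i.e.
$|f(\lambda)| \le p$, where $p$ is a positive number. Formula (197) leads to
the inequality

$$\|\sigma_\delta x\|^2 \le p^2 \sum_{k=1}^{n} \|\Delta_k\mathcal{E}_\lambda x\|^2, \tag{197$_1$}$$

from which it follows, by (196), that $\|\sigma_\delta x\| \le p\,\|x\|$, i.e. the norm
of the operator $\sigma_\delta$ does not exceed $p$ for any subdivision. Let us
now show that the sequence of elements $\sigma_{\delta_n} x$ has a limit for any
choice of $x$. In view of the condition of the theorem regarding the
uniform continuity of $f(\lambda)$ in $[m - \varepsilon_0, M]$, given any $\varepsilon > 0$, there
exists an $N$ such that

$$|f(\lambda') - f(\lambda'')| \le \varepsilon$$

if $\lambda'$ and $\lambda''$ belong to the same sub-interval of subdivision $\delta_n$ when
$n > N$. On applying Lemma 2, we can say that

$$\|\sigma_{\delta_n} x - \sigma_{\delta_m} x\| \le 2\varepsilon\,\|x\| \quad \text{for } n \text{ and } m > N,$$

i.e. the sequence $\sigma_{\delta_n} x$ is mutually convergent (a Cauchy sequence) and
so, since $H$ is complete, tends to a limiting element; the proof is
complete. It is natural to use the ordinary notation for the Stieltjes
integral to denote the limit of the operator sequences $\sigma_{\delta_n}$ on indefinite
subdivision (in the sense of a strong convergence of operators):

$$\lim \sum_{k=1}^{n} f(\nu_k)\,\Delta_k\mathcal{E}_\lambda = \int_{m-\varepsilon_0}^{M} f(\lambda)\,d\mathcal{E}_\lambda. \tag{198}$$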

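The following notation is used for the limiting element of the
sequence of elements (191) on indefinite subdivision:

$$\lim \sum_{k=1}^{n} f(\nu_k)\,\Delta_k\mathcal{E}_\lambda x = \int_{m-\varepsilon_0}^{M} f(\lambda)\,d\mathcal{E}_\lambda x. \tag{199}$$

It can similarly be shown that the corresponding integrals

$$\int_{\alpha}^{\beta} f(\lambda)\,d\mathcal{E}_\lambda \quad \text{and} \quad \int_{\alpha}^{\beta} f(\lambda)\,d\mathcal{E}_\lambda x \tag{200}$$

exist, over any part $[\alpha, \beta]$ of the interval $[m - \varepsilon_0, M]$. Notice that
the operator $\mathcal{E}_\lambda$ and element $\mathcal{E}_\lambda x$ are constant outside $[m - \varepsilon_0, M]$;
in view of this, the integral over the finite interval $[m - \varepsilon_0, M]$ is
often written as an integral over an infinite interval:

$$\int_{m-\varepsilon_0}^{M} f(\lambda)\,d\mathcal{E}_\lambda = \int_{-\infty}^{+\infty} f(\lambda)\,d\mathcal{E}_\lambda; \qquad \int_{m-\varepsilon_0}^{M} f(\lambda)\,d\mathcal{E}_\lambda x = \int_{-\infty}^{+\infty} f(\lambda)\,d\mathcal{E}_\lambda x. \tag{201}$$

By separating out the jump (if there is one) $\mathcal{E}_m$ of the operator
$\mathcal{E}_\lambda$ at the point $\lambda = m$, the above integrals can be reduced to integrals
over the interval $[m, M]$:

$$\int_{m-\varepsilon_0}^{M} f(\lambda)\,d\mathcal{E}_\lambda = f(m)\,\mathcal{E}_m + \int_{m}^{M} f(\lambda)\,d\mathcal{E}_\lambda; \qquad \int_{m-\varepsilon_0}^{M} f(\lambda)\,d\mathcal{E}_\lambda x = f(m)\,\mathcal{E}_m x + \int_{m}^{M} f(\lambda)\,d\mathcal{E}_\lambda x. \tag{202}$$

An elementary bound exists for the integral, analogous to (197$_1$):
if $|f(\lambda)| \le p$ in the interval $[\alpha, \beta]$, then

$$\Big\| \int_{\alpha}^{\beta} f(\lambda)\,d\mathcal{E}_\lambda x \Big\| \le p\,\|(\mathcal{E}_\beta - \mathcal{E}_\alpha)\,x\|. \tag{203}$$

We shall in future simply write $m$ for the lower limit, instead of
$(m - \varepsilon_0)$. The integral thus written, over the interval $[m, M]$, will
be equivalent to the integral over the original interval, with the
addition of $f(m)\,\mathcal{E}_m$ or $f(m)\,\mathcal{E}_m x$.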


142. The spectral function of a self-conjugate operator. If $f(\lambda)$ has
real values, the operator $\sigma_\delta$ is a linear combination of projectors
with real coefficients, i.e. is a self-conjugate operator, and the limit
of $\sigma_\delta$ on indefinite subdivision is also a self-conjugate operator.


On putting $f(\lambda) = \lambda$, we get a self-conjugate operator $A$:

$$A = \int_{m}^{M} \lambda\,d\mathcal{E}_\lambda \tag{204}$$

or

$$Ax = \int_{m}^{M} \lambda\,d\mathcal{E}_\lambda x. \tag{205}$$

Formula (204) is fundamental to the entire theory of self-conjugate
operators. We have arrived at it by starting from a resolution of
the identity $\mathcal{E}_\lambda$. For every $\mathcal{E}_\lambda$, there is a corresponding
self-conjugate operator $A$, given by (204). The converse also holds.
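Formula (204) can be tried out numerically in the finite-dimensional case, where the resolution of the identity is a step function. The sketch below is only an illustration under assumptions that are not in the text: the 2 x 2 matrix, the names `SPECTRUM`, `E` and `stieltjes_sum`, and the choice of the right end-point as $\nu_k$ are all hypothetical.

```python
import math

# Assumed example: A = [[2, 1], [1, 2]] has eigenvalues 1 and 3.
s = 1 / math.sqrt(2)

def projector(v):
    """Orthogonal projector onto the line spanned by the unit vector v."""
    return [[vi * vj for vj in v] for vi in v]

# Pairs (eigenvalue, projector onto its eigenspace).
SPECTRUM = [(1.0, projector([s, -s])), (3.0, projector([s, s]))]

def add(a, b, scale=1.0):
    """Return a + scale * b for 2 x 2 matrices."""
    return [[a[i][j] + scale * b[i][j] for j in range(2)] for i in range(2)]

def E(lam):
    """Spectral function: sum of the projectors whose eigenvalue is <= lam."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for lam_k, p in SPECTRUM:
        if lam_k <= lam:
            total = add(total, p)
    return total

def stieltjes_sum(f, lo, hi, n):
    """Sum of f(nu_k) * (E(l_k) - E(l_{k-1})) over n equal parts, nu_k = l_k."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(1, n + 1):
        left = lo + (hi - lo) * (k - 1) / n
        right = lo + (hi - lo) * k / n
        delta_E = add(E(right), E(left), scale=-1.0)
        total = add(total, delta_E, scale=f(right))
    return total

# Here m = 1 and M = 3; the sum starts to the left of m so that the jump
# of E at lambda = m is included.
approx_A = stieltjes_sum(lambda lam: lam, 0.5, 3.0, 1000)
print(approx_A)
```

The printed matrix should agree with the assumed matrix to within the mesh of the subdivision, since only the two sub-intervals containing the jumps of `E` contribute to the sum.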

THEOREM. Given any self-conjugate operator $A$, there exists a
resolution of the identity $\mathcal{E}_\lambda$ such that $A$ is expressed by (204).

The proof of this theorem is fairly complicated and, to avoid a
break in the exposition, we shall postpone it till the end of the present
section. We shall later prove a formula, in accordance with which
$\mathcal{E}_\lambda$ can be defined for a given self-conjugate operator $A$. It will follow
from this formula that different operators $A$ correspond to different
resolutions of the identity. By the above theorem, (204) represents
the general form of a bounded self-conjugate operator. If sum (191),
with $f(\lambda) = \lambda$, is multiplied by an element $y$, followed by a passage
to the limit, an expression is obtained for the scalar product $(Ax, y)$
as a Stieltjes integral:

$$(Ax, y) = \int_{m}^{M} \lambda\,d(\mathcal{E}_\lambda x, y). \tag{206}$$

It may be recalled that, if $\mathcal{E}_m$ differs from zero, the right-hand
side is to be understood as the sum:

$$m\,(\mathcal{E}_m x, y) + \int_{m}^{M} \lambda\,d(\mathcal{E}_\lambda x, y), \tag{207}$$

where the last integral is an ordinary Stieltjes integral. We could
have taken (206) as fundamental, instead of (204), since the operator
$A$ is completely defined by specifying a bilinear functional. Remember
that the scalar product $(\mathcal{E}_\lambda x, y)$ is expressed linearly in terms of four
scalar products of the form $(\mathcal{E}_\lambda z, z) = \|\mathcal{E}_\lambda z\|^2$ [125].

Since $\mathcal{E}_\mu \ge \mathcal{E}_\lambda$ when $\mu > \lambda$, $\|\mathcal{E}_\lambda z\|^2$ does not decrease as $\lambda$
increases, so that the (in general complex) function $(\mathcal{E}_\lambda x, y)$ under
the sign of the differential is a function of bounded variation in $\lambda$.


If we put $y = x$, we get an expression for a quadratic functional as a
Stieltjes integral:

$$(Ax, x) = \int_{m}^{M} \lambda\,d(\mathcal{E}_\lambda x, x). \tag{208}$$

Here, the non-decreasing function $(\mathcal{E}_\lambda x, x) = \|\mathcal{E}_\lambda x\|^2$ stands behind
the differential sign.

The family of projectors $\mathcal{E}_\lambda$ is usually known as the spectral function
of the self-conjugate operator $A$ defined by (204). Let us show that
the numbers $m$ and $M$ defined above coincide with the bounds of
the operator $A$ that we defined in [126]. We write down the quadratic
functional $(Ax, x)$ in the form

$$(Ax, x) = m\,\|\mathcal{E}_m x\|^2 + \int_{m}^{M} \lambda\,d\|\mathcal{E}_\lambda x\|^2.$$

The function behind the differential sign is non-decreasing in $\lambda$.
On replacing $\lambda$ first by $m$, then by $M$, we arrive at the inequalities

$$m\,\|\mathcal{E}_M x\|^2 \le (Ax, x) \le m\,\|\mathcal{E}_m x\|^2 + M\,[\|\mathcal{E}_M x\|^2 - \|\mathcal{E}_m x\|^2],$$

or, since $\mathcal{E}_M x = x$, at the inequalities

$$m\,\|x\|^2 \le (Ax, x) \le M\,\|x\|^2.$$

It remains to show that $m$ and $M$ are the strict bounds of $(Ax, x)$
when $\|x\| = 1$. Let us show e.g. that $M$ is the strict upper bound.
The difference $\mathcal{E}_M - \mathcal{E}_{M-\varepsilon} = E - \mathcal{E}_{M-\varepsilon}$, where $\varepsilon$ is a given positive
number, is a projector differing from the zero operator. Suppose
that the normalized element $x$ belongs to the subspace corresponding
to this projector. Then $(E - \mathcal{E}_{M-\varepsilon})\,x = x$, i.e. $\mathcal{E}_{M-\varepsilon} x = 0$, and all
the more $\mathcal{E}_\lambda x = 0$ for $\lambda < M - \varepsilon$. Hence, on replacing the factor
$\lambda$ by $(M - \varepsilon)$ in (208) and taking $\|x\| = 1$, we can write

$$(Ax, x) \ge (M - \varepsilon)\,\|(\mathcal{E}_M - \mathcal{E}_{M-\varepsilon})\,x\|^2 = M - \varepsilon,$$

whence it follows, since $\varepsilon$ is arbitrary, that $M$ is the strict upper
bound of $(Ax, x)$ when $\|x\| = 1$.


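Another formula may be deduced. We multiply both sides of the
equation

$$\sigma_\delta = \sum_{k=1}^{n} \nu_k\,\Delta_k\mathcal{E}_\mu = \sum_{k=1}^{n} \nu_k\,(\mathcal{E}_{\lambda_k} - \mathcal{E}_{\lambda_{k-1}}) \tag{209}$$

by $\mathcal{E}_\lambda$, with the assumption that $\lambda$ is one of the points of subdivision $\lambda_k$.
Now, by (180), we have $\mathcal{E}_\lambda\,\Delta_k\mathcal{E}_\mu = \Delta_k\mathcal{E}_\mu\,\mathcal{E}_\lambda = 0$ for $\lambda \le \lambda_{k-1}$ and
$\mathcal{E}_\lambda\,\Delta_k\mathcal{E}_\mu = \Delta_k\mathcal{E}_\mu\,\mathcal{E}_\lambda = \Delta_k\mathcal{E}_\mu$ for $\lambda \ge \lambda_k$, so that

$$\mathcal{E}_\lambda\,\sigma_\delta = \sigma_\delta\,\mathcal{E}_\lambda = \sum_{\lambda_k \le \lambda} \nu_k\,\Delta_k\mathcal{E}_\mu,$$

and a passage to the limit gives us

$$\mathcal{E}_\lambda A = A\,\mathcal{E}_\lambda = \int_{m}^{\lambda} \mu\,d\mathcal{E}_\mu \qquad (\lambda \ge m), \tag{210}$$

together with the analogous formula for a bilinear functional:

$$(\mathcal{E}_\lambda A x, y) = \int_{m}^{\lambda} \mu\,d(\mathcal{E}_\mu x, y). \tag{211}$$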

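143. Continuous functions of a self-conjugate operator. If $A$ is a
self-conjugate operator defined by (204), we can associate with any
function $f(\lambda)$, continuous in the interval $[m, M]$, an operator $f(A)$,
defined by

$$f(A) = \int_{m}^{M} f(\lambda)\,d\mathcal{E}_\lambda. \tag{212}$$

This correspondence between a continuous function $f(\lambda)$ and a
continuous operator $f(A)$ is distributive, i.e. the operator $c_1 f_1(A) + c_2 f_2(A)$
corresponds to the continuous function $c_1 f_1(\lambda) + c_2 f_2(\lambda)$. This is an
immediate consequence of the fact that integral (212) is distributive
with respect to $f(\lambda)$. Moreover, the correspondence is multiplicative,
i.e. the operator $f_1(A)\,f_2(A)$ (or the operator equal to it, $f_2(A)\,f_1(A)$)
corresponds to the function $f_1(\lambda)\,f_2(\lambda)$. To prove this, we form the
product of sums $\sigma_\delta$ for $f_1(\lambda)$ and $f_2(\lambda)$:

$$\sum_{k=1}^{n} f_1(\nu_k)\,\Delta_k\mathcal{E}_\lambda \cdot \sum_{k=1}^{n} f_2(\nu_k)\,\Delta_k\mathcal{E}_\lambda.$$

Using (182) and (184), we can write the above product as

$$\sum_{k=1}^{n} f_1(\nu_k)\,\Delta_k\mathcal{E}_\lambda \cdot \sum_{k=1}^{n} f_2(\nu_k)\,\Delta_k\mathcal{E}_\lambda = \sum_{k=1}^{n} f_1(\nu_k)\,f_2(\nu_k)\,\Delta_k\mathcal{E}_\lambda, \tag{213}$$

and passage to the limit gives us

$$\int_{m}^{M} f_1(\lambda)\,d\mathcal{E}_\lambda \cdot \int_{m}^{M} f_2(\lambda)\,d\mathcal{E}_\lambda = \int_{m}^{M} f_1(\lambda)\,f_2(\lambda)\,d\mathcal{E}_\lambda, \tag{214}$$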


which is what we set out to prove. Formulae analogous to (212) can
be written for a bilinear and quadratic functional:

$$(f(A)\,x, y) = \int_{m}^{M} f(\lambda)\,d(\mathcal{E}_\lambda x, y); \qquad (f(A)\,x, x) = \int_{m}^{M} f(\lambda)\,d\|\mathcal{E}_\lambda x\|^2. \tag{215}$$

Further, we have the formula analogous to (210):

$$\mathcal{E}_\lambda\,f(A) = f(A)\,\mathcal{E}_\lambda = \int_{m}^{\lambda} f(\mu)\,d\mathcal{E}_\mu. \tag{216}$$

On taking (214) into account, we get the following formula for
positive integral powers of $A$:

$$A^n = \int_{m}^{M} \lambda^n\,d\mathcal{E}_\lambda \qquad (n = 1, 2, 3, \ldots) \tag{217}$$

and for a polynomial:

$$a_0 A^n + a_1 A^{n-1} + \ldots + a_{n-1} A + a_n = \int_{m}^{M} (a_0 \lambda^n + a_1 \lambda^{n-1} + \ldots + a_{n-1} \lambda + a_n)\,d\mathcal{E}_\lambda. \tag{218}$$

As mentioned above, if $f(\lambda)$ is a real function, the operator $\sigma_\delta$ is
self-conjugate, and the limit of $\sigma_\delta$, i.e. $f(A)$, is also a self-conjugate
operator. If $f(\lambda) \ge 0$ in the interval $[m, M]$, by (215), the operator
$f(A)$ is positive. Now suppose that $f(\lambda)$ is complex: $f(\lambda) = \varphi(\lambda) + i\,\psi(\lambda)$.
We now have $f(A) = \varphi(A) + i\,\psi(A)$, where $\varphi(A)$ and $\psi(A)$
are self-conjugate operators. On forming the operator $\bar f(A) = \varphi(A) - i\,\psi(A)$,
we can use the fact that $\varphi(A)$ and $\psi(A)$ are self-conjugate
to write

$$(f(A)\,x, y) = (x, \bar f(A)\,y),$$

i.e. the operator $\bar f(A)$ is the conjugate to $f(A)$.

Some commutation properties should be noticed. It follows from
(210) that the operator $\mathcal{E}_\lambda$ commutes with $A$ for any value of $\lambda$.
Hence the operator $\Delta\mathcal{E}_\lambda = \mathcal{E}_\beta - \mathcal{E}_\alpha$ also commutes with $A$, for any
values of $\alpha$ and $\beta$. Thus the sum $\sigma_\delta$ commutes with $A$, and we find
on passing to the limit that $f(A)$ commutes with $A$. Let us now prove
a theorem.

THEOREM 1. The operator f(A) commutes with any operator B that 
commutes with A. 


Let $\varepsilon_n$ be a sequence of positive numbers tending to zero. By
Weierstrass's theorem [II; 154], there exists a sequence of polynomials
$P_n(\lambda)$ such that

$$|f(\lambda) - P_n(\lambda)| \le \varepsilon_n \qquad (m \le \lambda \le M). \tag{219}$$

We form the difference

$$f(A) - P_n(A) = \int_{m}^{M} [f(\lambda) - P_n(\lambda)]\,d\mathcal{E}_\lambda.$$

On taking (203) and (219) into account, we can write

$$\|[f(A) - P_n(A)]\,x\| \le \varepsilon_n\,\|x\|, \tag{220}$$

whence $P_n(A) \to f(A)$. An operator $B$ that commutes with $A$ also
commutes with any polynomial $P_n(A)$, i.e. $B\,P_n(A) = P_n(A)\,B$.
On passing to the limit, we get $B f(A) = f(A)\,B$, and the theorem is
proved. We shall show later [161] that, given any $\lambda$, the spectral
function $\mathcal{E}_\lambda$ commutes with any operator $B$ that commutes with $A$.
Conversely, if $B$ commutes with $\mathcal{E}_\lambda$, $B$ commutes with any operator
$\Delta_k\mathcal{E}_\lambda$, and hence commutes with sum (209) and, in the limit, with
the operator $A$. Thus we have the following theorem.

THEOREM 2. The necessary and sufficient condition for an operator
to commute with $A$ is that it commute with $\mathcal{E}_\lambda$ for any $\lambda$.

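The following example of a function of an operator has already
been utilized [138]. Let $A$ be a positive operator, i.e. $m \ge 0$, and
let $f(\lambda) = \sqrt{\lambda}$ $(\lambda \ge 0)$, where the arithmetic value of the root is taken.
We can define the positive operator $\sqrt{A}$:

$$\sqrt{A} = \int_{m}^{M} \sqrt{\lambda}\,d\mathcal{E}_\lambda$$

or

$$(\sqrt{A}\,x, y) = \int_{m}^{M} \sqrt{\lambda}\,d(\mathcal{E}_\lambda x, y).$$

By (214), we have $\sqrt{A}\,\sqrt{A} = A$.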
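The multiplicative property (214) and the relation $\sqrt{A}\sqrt{A} = A$ can be checked in a finite-dimensional setting, where integral (212) reduces to a finite sum over the eigenvalues. The following is a minimal sketch under assumptions: the 2 x 2 example and the helper names `func_of_A`, `matmul` and `close` are hypothetical and serve only as illustration.

```python
import math

s = 1 / math.sqrt(2)

def projector(v):
    """Orthogonal projector onto the line spanned by the unit vector v."""
    return [[vi * vj for vj in v] for vi in v]

# Spectral data of the assumed example A = [[2, 1], [1, 2]].
SPECTRUM = [(1.0, projector([s, -s])), (3.0, projector([s, s]))]

def func_of_A(f):
    """f(A) = sum of f(lambda_k) * P_k, the finite-dimensional form of (212)."""
    return [[sum(f(lam) * p[i][j] for lam, p in SPECTRUM) for j in range(2)]
            for i in range(2)]

def matmul(a, b):
    """Product of two 2 x 2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(a, b, tol=1e-12):
    """Entry-wise comparison up to rounding error."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

A = func_of_A(lambda lam: lam)
sqrt_A = func_of_A(math.sqrt)

def f1(lam):
    return lam + 1.0

def f2(lam):
    return lam * lam

# Multiplicativity (214): f1(A) f2(A) corresponds to f1(lambda) f2(lambda).
product_ok = close(matmul(func_of_A(f1), func_of_A(f2)),
                   func_of_A(lambda lam: f1(lam) * f2(lam)))
# The positive square root satisfies sqrt(A) sqrt(A) = A.
sqrt_ok = close(matmul(sqrt_A, sqrt_A), A)
print(product_ok, sqrt_ok)
```

Both checks rest on the orthogonality of the projectors belonging to different eigenvalues, which is the finite-dimensional counterpart of (182) and (184).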


144. A formula for the resolvent and a characteristic of regular
values of $\lambda$.

The spectral function can be used to give a formula for the resolvent
[130] and to indicate a new characteristic of regular values of $\lambda$.
We shall in future speak of the resolvent $R_l$ only with regular values
of $l$.


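THEOREM 1. If $l$ is a non-real number, or is real but outside the interval
$[m, M]$, the resolvent $R_l$ of the operator $A$ is defined by

$$R_l = \int_{m}^{M} \frac{1}{\lambda - l}\,d\mathcal{E}_\lambda. \tag{221}$$

Given the hypotheses, the function $1/(\lambda - l)$ is continuous in the
interval $[m - \varepsilon_0, M]$ for sufficiently small $\varepsilon_0$. By using (214), we
can write

$$\int_{m}^{M} \frac{1}{\lambda - l}\,d\mathcal{E}_\lambda \cdot \int_{m}^{M} (\lambda - l)\,d\mathcal{E}_\lambda = \int_{m}^{M} (\lambda - l)\,d\mathcal{E}_\lambda \cdot \int_{m}^{M} \frac{1}{\lambda - l}\,d\mathcal{E}_\lambda = \int_{m}^{M} d\mathcal{E}_\lambda = E; \tag{222}$$

but

$$\int_{m}^{M} (\lambda - l)\,d\mathcal{E}_\lambda = A - lE,$$

and (222) leads directly to (221).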

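THEOREM 2. If $l$ belongs to the interval $[m, M]$, but lies inside some
interval $[\alpha, \beta]$ in which $\mathcal{E}_\lambda$ is constant, i.e. $\mathcal{E}_\beta = \mathcal{E}_\alpha$, the resolvent $R_l$
exists and is given by (221).

We split $[m - \varepsilon_0, M]$ into three parts: $[m - \varepsilon_0, \alpha]$, $[\alpha, \beta]$ and $[\beta, M]$.
The function $1/(\lambda - l)$ is continuous in $[m - \varepsilon_0, \alpha]$ and $[\beta, M]$,
whilst $\mathcal{E}_\lambda$ is constant in $[\alpha, \beta]$, and all the $\Delta_k\mathcal{E}_\lambda$ are annihilation
operators for this last interval. We extend the function $1/(\lambda - l)$
from the extreme to the central interval $[\alpha, \beta]$ in such a way that
it is continuous throughout $[m - \varepsilon_0, M]$. Let $\varphi(\lambda)$ denote the function
thus formed. The value of the integral

$$R_l = \int_{m}^{M} \varphi(\lambda)\,d\mathcal{E}_\lambda \tag{223}$$

is obviously independent of the values of $\varphi(\lambda)$ in $[\alpha, \beta]$. By using
(214), we can write

$$\int_{m}^{M} \varphi(\lambda)\,d\mathcal{E}_\lambda \cdot \int_{m}^{M} (\lambda - l)\,d\mathcal{E}_\lambda = \int_{m}^{M} (\lambda - l)\,d\mathcal{E}_\lambda \cdot \int_{m}^{M} \varphi(\lambda)\,d\mathcal{E}_\lambda = \int_{m}^{M} (\lambda - l)\,\varphi(\lambda)\,d\mathcal{E}_\lambda,$$

and, on observing that $\varphi(\lambda) = 1/(\lambda - l)$ for $\lambda \le \alpha$ and $\lambda \ge \beta$, and
that $\mathcal{E}_\lambda$ is constant in $[\alpha, \beta]$, we get

$$\int_{m}^{M} \varphi(\lambda)\,(\lambda - l)\,d\mathcal{E}_\lambda = \int_{m}^{\alpha} d\mathcal{E}_\lambda + \int_{\beta}^{M} d\mathcal{E}_\lambda = \mathcal{E}_\alpha + (E - \mathcal{E}_\beta) = E,$$

whence it follows that (223) yields the resolvent. We can obviously
take integral (221) instead of (223), the integration being carried out
from $m - \varepsilon_0$ to $\alpha$ and from $\beta$ to $M$. It follows from the proof of the
theorem that a value $\lambda = l$ from the interval $[m, M]$ is regular
provided it can be covered by an interval in which $\mathcal{E}_\lambda$ is constant. We
show in the next theorem that this condition is necessary, and not
merely sufficient, for regularity.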
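Theorems 1 and 2 can be illustrated in the finite-dimensional case, where (221) becomes a finite sum. The sketch below is an assumption-laden illustration only: the 2 x 2 example, with eigenvalues 1 and 3, and the helper names are hypothetical. It checks that $R_l(A - lE) = E$ both for a non-real $l$ and for the real value $l = 2$, which lies in $[m, M] = [1, 3]$ but inside an interval where the spectral function is constant.

```python
import math

s = 1 / math.sqrt(2)

def projector(v):
    """Orthogonal projector onto the line spanned by the unit vector v."""
    return [[vi * vj for vj in v] for vi in v]

# Assumed example: A = [[2, 1], [1, 2]], eigenvalues 1 and 3.
SPECTRUM = [(1.0, projector([s, -s])), (3.0, projector([s, s]))]
A = [[2.0, 1.0], [1.0, 2.0]]
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

def resolvent(l):
    """R_l = sum of P_k / (lambda_k - l), the finite-dimensional form of (221)."""
    return [[sum(p[i][j] / (lam - l) for lam, p in SPECTRUM) for j in range(2)]
            for i in range(2)]

def times_A_minus_l(r, l):
    """Return the product r (A - l E)."""
    shifted = [[A[i][j] - l * IDENTITY[i][j] for j in range(2)] for i in range(2)]
    return [[sum(r[i][k] * shifted[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def is_identity(m, tol=1e-12):
    """Entry-wise comparison with the identity up to rounding error."""
    return all(abs(m[i][j] - IDENTITY[i][j]) < tol for i in range(2) for j in range(2))

# l = 2 + 1j is non-real (Theorem 1); l = 2 is real and lies in a gap of the
# spectrum (Theorem 2).
checks = [is_identity(times_A_minus_l(resolvent(l), l)) for l in (2 + 1j, 2.0)]
print(checks)
```

For $l$ equal to an eigenvalue the sum in `resolvent` would divide by zero, which is the finite-dimensional trace of the fact that such an $l$ is not regular.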

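THEOREM 3. If, given the real value $\lambda = l$, the resolvent $R_l$ exists,
$l$ must lie inside an interval $[\alpha, \beta]$ in which $\mathcal{E}_\lambda$ is constant.

Let $[\alpha, \beta]$ be any interval containing $l$ as an interior point, and
$\Delta\mathcal{E}_\lambda = \mathcal{E}_\beta - \mathcal{E}_\alpha$. By definition of the resolvent,
$\Delta\mathcal{E}_\lambda x = R_l\,(A - lE)\,\Delta\mathcal{E}_\lambda x$. But we can write, by (216),

$$(A - lE)\,\Delta\mathcal{E}_\lambda x = \int_{\alpha}^{\beta} (\lambda - l)\,d\mathcal{E}_\lambda x,$$

so that

$$\Delta\mathcal{E}_\lambda x = R_l \int_{\alpha}^{\beta} (\lambda - l)\,d\mathcal{E}_\lambda x \qquad (\alpha < l < \beta).$$

Let us write $N$ for the norm of the operator $R_l$:

$$\|\Delta\mathcal{E}_\lambda x\| \le N\,\Big\| \int_{\alpha}^{\beta} (\lambda - l)\,d\mathcal{E}_\lambda x \Big\|. \tag{224}$$

We have $|\lambda - l| \le \beta - \alpha$ in the interval $[\alpha, \beta]$, and, by (203),
inequality (224) leads to

$$\|\Delta\mathcal{E}_\lambda x\| \le N\,(\beta - \alpha)\,\|\Delta\mathcal{E}_\lambda x\|. \tag{225}$$

The interval $[\alpha, \beta]$, containing $l$ as an interior point, is taken so
small that $N(\beta - \alpha) < 1$. It now follows at once from (225) that
$\|\Delta\mathcal{E}_\lambda x\| = 0$, i.e. $\mathcal{E}_\beta = \mathcal{E}_\alpha$, and $\mathcal{E}_\lambda$ is constant in $[\alpha, \beta]$. By
combining Theorems 2 and 3, we get the corollary:

COROLLARY. The necessary and sufficient condition for a real $\lambda$ to be
regular is that $\lambda$ lie inside an interval in which $\mathcal{E}_\lambda$ is constant.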


It follows at once from this corollary that, if a real $\lambda$ is regular,
all the real $\lambda$ sufficiently close to it are also regular, i.e. the regular
values of $\lambda$ form an open set on the real axis; this gives us the further
corollary: the points of the spectrum form a closed set.

A bilinear functional for the operator $R_l$ has the formula

$$(R_l x, y) = \int_{m}^{M} \frac{1}{\lambda - l}\,d(\mathcal{E}_\lambda x, y). \tag{226}$$

We can take $(-\infty, +\infty)$ as the interval of integration in this
integral [108], and $(\mathcal{E}_\lambda x, y)$ is a function of bounded variation in $\lambda$.
On setting $l = \sigma + i\tau$ and applying the inversion formula for a
Cauchy-Stieltjes integral [30], we get the following expression for the
spectral function in terms of the resolvent:


$$\frac{1}{2}\,[(\mathcal{E}_{\lambda+0}\,x, y) + (\mathcal{E}_{\lambda-0}\,x, y)] = \lim_{\tau \to +0} \frac{1}{2\pi i} \int_{-\infty}^{\lambda} ([R_{\sigma + i\tau} - R_{\sigma - i\tau}]\,x, y)\,d\sigma. \tag{227}$$

If $\mathcal{E}_\lambda$ is continuous at the point $\lambda$, the left-hand side of (227) is
equal to $(\mathcal{E}_\lambda x, y)$. At points of discontinuity, $(\mathcal{E}_\lambda x, y)$ is defined
from its continuity from the right. Notice that the resolvent $R_l$ is
defined in terms of the operator $A$ itself, i.e. it follows from (227)
that, given a self-conjugate operator $A$, there is only one spectral
function in terms of which it is expressed by (204). Notice also that,
by Theorem 1 of [143], the operators $R_l$ commute for different $l$.


145. Eigenvalues and eigenelements. The spectral function can
be used to give a very simple definition of the eigenvalues and
eigenelements of a self-conjugate operator.

THEOREM. The necessary and sufficient condition for $\lambda = \lambda_0$ to be
an eigenvalue of the self-conjugate operator $A$ with spectral function
$\mathcal{E}_\lambda$ is that $\mathcal{E}_\lambda$ have $\lambda_0$ as a point of discontinuity, i.e. $\mathcal{E}_{\lambda_0} - \mathcal{E}_{\lambda_0 - 0} > 0$.
In this case $\mathcal{E}_{\lambda_0} - \mathcal{E}_{\lambda_0 - 0}$ is the projector into the subspace $M_0$ of
eigenelements corresponding to the eigenvalue $\lambda_0$.

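Let $M_0$ be the subspace corresponding to the projector $\mathcal{E}_{\lambda_0} - \mathcal{E}_{\lambda_0 - 0}$.
If $\lambda_0$ is a point of continuity of $\mathcal{E}_\lambda$, $M_0$ consists of the zero element
only. The proof amounts to proving the following two assertions: if
$x_0 \in M_0$, then $(A - \lambda_0 E)\,x_0 = 0$, and conversely, if $(A - \lambda_0 E)\,x_0 = 0$,
then $x_0 \in M_0$. Suppose first that $x_0 \in M_0$, i.e. $(\mathcal{E}_{\lambda_0} - \mathcal{E}_{\lambda_0 - 0})\,x_0 = x_0$.
Now, all the more, $\mathcal{E}_{\lambda_0} x_0 = x_0$, so that $\mathcal{E}_{\lambda_0 - 0}\,x_0 = 0$. Since $\mathcal{E}_\lambda$
does not decrease as $\lambda$ increases, we can say that $\mathcal{E}_\lambda x_0 = x_0$ for
$\lambda \ge \lambda_0$ and $\mathcal{E}_\lambda x_0 = 0$ for $\lambda < \lambda_0$. We apply (205) to the element $x_0$:

$$A x_0 = \int_{m}^{M} \lambda\,d\mathcal{E}_\lambda x_0, \tag{228}$$

and we can assume when forming the sums $\sigma_\delta$ that $\lambda_0$ is a point of
subdivision. By the foregoing, all the differences $\Delta_k\mathcal{E}_\lambda x_0$ vanish with
the exception of one, corresponding to the sub-interval with
right-hand end $\lambda_0$, i.e.

$$A x_0 = \lim_{\varepsilon \to 0} \nu\,(\mathcal{E}_{\lambda_0} - \mathcal{E}_{\lambda_0 - \varepsilon})\,x_0,$$

where $\lambda_0 - \varepsilon \le \nu \le \lambda_0$. We obtain on passing to the limit:

$$A x_0 = \lambda_0\,(\mathcal{E}_{\lambda_0} - \mathcal{E}_{\lambda_0 - 0})\,x_0;$$

but $x_0 \in M_0$ by hypothesis, so that the right-hand side is here equal
to $\lambda_0 x_0$, i.e. $x_0$ satisfies the equation $(A - \lambda_0 E)\,x_0 = 0$. Now suppose
conversely that $x_0$ satisfies this equation, and let us show that $x_0 \in M_0$.
It follows from $(A - \lambda_0 E)\,x_0 = 0$ that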


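$$((A - \lambda_0 E)^2\,x_0, x_0) = 0, \tag{229}$$

or, on expressing the bilinear functional in terms of the Stieltjes
integral:

$$\int_{m}^{M} (\lambda - \lambda_0)^2\,d\|\mathcal{E}_\lambda x_0\|^2 = 0. \tag{230}$$

The integrable function $(\lambda - \lambda_0)^2$ is non-negative, and the function
behind the differential sign is a non-decreasing function of $\lambda$. It follows
from this that all the elements of integral (230) are non-negative,
and the magnitude of this integral over any part of the interval of
integration must also be zero. Given $\varepsilon > 0$, we can write

$$\int_{\lambda_0 + \varepsilon}^{M} (\lambda - \lambda_0)^2\,d\|\mathcal{E}_\lambda x_0\|^2 = 0. \tag{231}$$

The integrable function $(\lambda - \lambda_0)^2$ is $\ge \varepsilon^2$ in the interval of
integration, and it follows all the more from (231) that

$$\varepsilon^2 \int_{\lambda_0 + \varepsilon}^{M} d\|\mathcal{E}_\lambda x_0\|^2 = 0,$$

i.e. $\varepsilon^2\,[\|x_0\|^2 - \|\mathcal{E}_{\lambda_0 + \varepsilon}\,x_0\|^2] = 0$.

Hence it follows, since $\varepsilon$ is arbitrary, that $\mathcal{E}_\lambda x_0 = x_0$ for $\lambda > \lambda_0$.
It can similarly be shown that $\mathcal{E}_\lambda x_0 = 0$ for $\lambda < \lambda_0$. It follows at once
from this that $x_0 = \lim_{\varepsilon \to 0} (\mathcal{E}_{\lambda_0 + \varepsilon} - \mathcal{E}_{\lambda_0 - \varepsilon})\,x_0 = (\mathcal{E}_{\lambda_0} - \mathcal{E}_{\lambda_0 - 0})\,x_0$, and
the theorem is therefore proved.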
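As a small worked illustration of the theorem, consider a self-conjugate operator in two-dimensional space. The matrix below is an assumed example, not one taken from the text; $P_1$ and $P_3$ denote the projectors into its eigenspaces for the eigenvalues 1 and 3. The spectral function is then a step function, and its jumps are exactly these projectors:

```latex
\[
A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \qquad
P_1 = \frac{1}{2}\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}, \qquad
P_3 = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix},
\]
\[
\mathcal{E}_\lambda =
\begin{cases}
0,   & \lambda < 1, \\
P_1, & 1 \le \lambda < 3, \\
E,   & \lambda \ge 3,
\end{cases}
\qquad
\mathcal{E}_1 - \mathcal{E}_{1-0} = P_1, \qquad
\mathcal{E}_3 - \mathcal{E}_{3-0} = E - P_1 = P_3 .
\]
```

At every other value of $\lambda$ the function $\mathcal{E}_\lambda$ is continuous, in agreement with the fact that this operator has no eigenvalues other than 1 and 3.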

If the operator $A$ has eigenvalues, on introducing a closed
orthonormal system into each subspace of eigenelements corresponding
to a fixed eigenvalue, we arrive at an orthonormal system of
eigenelements of the operator $A$ [128]:

$$x_1, x_2, x_3, \ldots \tag{232}$$

and at a sequence of corresponding eigenvalues:

$$\mu_1, \mu_2, \mu_3, \ldots \tag{233}$$

If $r$ is the rank of an eigenvalue, this latter figures $r$ times in
sequence (233). The number $r$ may in fact be infinite.

On writing $\lambda_k$ $(k = 1, 2, \ldots)$ for the points at which $\mathcal{E}_\lambda$ is
discontinuous, and $L_k$ for the corresponding subspaces of eigenelements,
we can write

$$P_{L_k} = \mathcal{E}_{\lambda_k} - \mathcal{E}_{\lambda_k - 0}. \tag{234}$$

We form the orthogonal sum of subspaces $L_k$:

$$H' = L_1 \oplus L_2 \oplus L_3 \oplus \ldots \tag{235}$$

As we know, the projection operator into subspace $H'$ is given by

$$P_{H'} = P_{L_1} + P_{L_2} + P_{L_3} + \ldots \tag{236}$$

$H'$ is the subspace consisting of elements $x$ that are expressible
in terms of elements of the orthonormal system (232) with the aid
of the convergent series

$$x = a_1 x_1 + a_2 x_2 + a_3 x_3 + \ldots \tag{237}$$


146. Purely point spectra. A self-conjugate operator $A$ is said to
have a purely point spectrum if the orthonormal system (232) is
closed in space $H$ (which is separable) [cf. 128]. This is equivalent
to the fact that the subspace $H'$, defined by (235), is the same as $H$,
or that the projector $P_{H'}$ defined by (236) is the identity
transformation, i.e.

$$E = P_{H'}. \tag{238}$$

On multiplying both sides of this formula by $\mathcal{E}_\lambda$ and recalling that
$\mathcal{E}_\lambda\,(\mathcal{E}_{\lambda_k} - \mathcal{E}_{\lambda_k - 0}) = 0$ for $\lambda < \lambda_k$ and is equal to $\mathcal{E}_{\lambda_k} - \mathcal{E}_{\lambda_k - 0}$ for $\lambda \ge \lambda_k$,
we obtain an expression for $\mathcal{E}_\lambda$ in terms of the jumps of this projector:

$$\mathcal{E}_\lambda = \sum_{\lambda_k \le \lambda} P_{L_k} = \sum_{\lambda_k \le \lambda} (\mathcal{E}_{\lambda_k} - \mathcal{E}_{\lambda_k - 0}). \tag{239}$$

In the present case, any element $x$ is expressible by series (237),
the $a_s$ being the Fourier coefficients of $x$ with respect to system
(232). On applying the operator $A$ to both sides of (237) and recalling
that $A x_k = \mu_k x_k$, we obtain

$$Ax = \sum_{s} a_s\,\mu_s\,x_s. \tag{240}$$

On forming the scalar product with $y$ and writing $b_s$ for the Fourier
coefficients of the element $y$, i.e.

$$b_s = (y, x_s); \qquad \bar b_s = (x_s, y),$$

we obtain the expression for a bilinear functional:

$$(Ax, y) = \sum_{s} \mu_s\,a_s\,\bar b_s. \tag{241}$$

When $y = x$, we get the formula for a quadratic functional:

$$(Ax, x) = \sum_{s} \mu_s\,|a_s|^2, \tag{242}$$

which is entirely analogous to the expression for a quadratic form
(Hermitian form) as a sum of squares. Thus, in the case of a purely
point spectrum, very simple expressions are obtained for the operator
$A$ itself, and for the bilinear and quadratic functionals, with the
aid of the orthonormal system (232). Let us now turn to the
so-called purely continuous spectrum.
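Formulae (241) and (242) can be checked directly in a finite-dimensional space, where every self-conjugate operator has a purely point spectrum. The sketch below is illustrative only: the 2 x 2 matrix, the two complex vectors and the helper names `inner` and `apply` are assumptions introduced here. The scalar product is taken linear in its first argument, so that the Fourier coefficients are $a_s = (x, x_s)$ and $b_s = (y, x_s)$.

```python
import math

s = 1 / math.sqrt(2)
# Assumed example: A = [[2, 1], [1, 2]] with orthonormal eigenvectors
# x_1, x_2 and eigenvalues mu_1 = 1, mu_2 = 3.
A = [[2.0, 1.0], [1.0, 2.0]]
EIGENVECTORS = [[s, -s], [s, s]]
EIGENVALUES = [1.0, 3.0]

def inner(u, v):
    """Scalar product (u, v), linear in u and conjugate-linear in v."""
    return sum(ui * complex(vi).conjugate() for ui, vi in zip(u, v))

def apply(matrix, u):
    """Apply a matrix to a vector."""
    return [sum(row[j] * u[j] for j in range(len(u))) for row in matrix]

x = [1 + 2j, 3 - 1j]
y = [2 - 1j, 1j]

a = [inner(x, e) for e in EIGENVECTORS]  # Fourier coefficients of x
b = [inner(y, e) for e in EIGENVECTORS]  # Fourier coefficients of y

bilinear_direct = inner(apply(A, x), y)
bilinear_series = sum(mu * a_s * b_s.conjugate()
                      for mu, a_s, b_s in zip(EIGENVALUES, a, b))
quadratic_direct = inner(apply(A, x), x)
quadratic_series = sum(mu * abs(a_s) ** 2 for mu, a_s in zip(EIGENVALUES, a))
print(bilinear_direct, bilinear_series)
print(quadratic_direct, quadratic_series)
```

In each printed pair the two values should coincide up to rounding error, the second being the sum over the spectrum.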


147. A continuous simple spectrum. A self-conjugate operator $A$ is
said to have a purely continuous spectrum if the spectral function
$\mathcal{E}_\lambda$ is continuous for all values of $\lambda$. Our problem will be to obtain
formulae for the case of a purely continuous spectrum analogous to
the formulae of the previous section. As a preliminary, a new concept
must be introduced.

Let $e$ be a set of elements of $H$, and $x_1, x_2, \ldots, x_n$ any given elements
of $H$ belonging to $e$. We form the linear combination $c_1 x_1 + c_2 x_2 + \ldots + c_n x_n$,
with arbitrary coefficients $c_k$. The set of elements of
$H$ which can thus be written as finite linear combinations of elements
of $e$ is clearly a lineal, say $L$. Let us introduce our new concept.


DEFINITION. The closure of the lineal $L$ is called the closed linear
envelope of the set $e$ of elements of $H$.

The closed linear envelope is a subspace, and the characteristic of
elements $x$ belonging to it is as follows: given any $\varepsilon > 0$, there exists
a finite set of elements $x_1, x_2, \ldots, x_n$ belonging to $e$, and there exist
numbers $c_k$, such that

$$\|x - (c_1 x_1 + c_2 x_2 + \ldots + c_n x_n)\| \le \varepsilon.$$

In particular, all finite linear combinations of elements of $e$ obviously
belong to our subspace.

Let $\mathcal{E}_\lambda$ be the spectral function of an operator with a purely
continuous spectrum. Notice that $\mathcal{E}_m = 0$ in this case. We take a non-zero
element $x$ and form the set of elements

$$\mathcal{E}_\lambda x, \tag{243}$$

where $\lambda$ runs through all values from $m$ to $M$. Let $C_x$ denote the closed
linear envelope of elements (243). We can form a continuous function
of $\lambda$ corresponding to any given element $y$ of $H$:

$$\varphi_y(\lambda) = (y, \mathcal{E}_\lambda x). \tag{244}$$

This function is obviously distributive with respect to the
subscript $y$, i.e.

$$\varphi_{ay + bz}(\lambda) = a\,\varphi_y(\lambda) + b\,\varphi_z(\lambda).$$

Let us also form the following two continuous functions of $\lambda$:

$$\varrho(\lambda) = (\mathcal{E}_\lambda x, x) = \|\mathcal{E}_\lambda x\|^2; \qquad h_y(\lambda) = (\mathcal{E}_\lambda y, y) = \|\mathcal{E}_\lambda y\|^2. \tag{245}$$

As we know, these do not decrease as $\lambda$ increases. If $\Delta$ is any interval
$[\alpha, \beta]$, we can introduce the usual notation for any function $f(\lambda)$:

$$\Delta f(\lambda) = f(\beta) - f(\alpha). \tag{246}$$

We have, for instance,

$$\Delta\varrho(\lambda) = (\Delta\mathcal{E}_\lambda x, x) = ((\mathcal{E}_\beta - \mathcal{E}_\alpha)\,x, x),$$

i.e.

$$\Delta\varrho(\lambda) = \|\Delta\mathcal{E}_\lambda x\|^2, \tag{247}$$

and similarly,

$$\Delta h_y(\lambda) = \|\Delta\mathcal{E}_\lambda y\|^2. \tag{248}$$


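We have for the function $\varphi_y(\lambda)$:

$$\Delta\varphi_y(\lambda) = (y, \Delta\mathcal{E}_\lambda x) = (y, (\Delta\mathcal{E}_\lambda)^2 x) = (\Delta\mathcal{E}_\lambda y, \Delta\mathcal{E}_\lambda x),$$

so that

$$|\Delta\varphi_y(\lambda)|^2 \le \|\Delta\mathcal{E}_\lambda x\|^2 \cdot \|\Delta\mathcal{E}_\lambda y\|^2,$$

i.e.

$$|\Delta\varphi_y(\lambda)|^2 \le \Delta\varrho(\lambda)\,\Delta h_y(\lambda).$$

It is clear from this that the integral exists [81]:

$$\int_{m}^{M} \frac{|d\varphi_y(\lambda)|^2}{d\varrho(\lambda)}. \tag{249}$$

It will be shown below that, if $y \in C_x$, this integral is equal to
$\|y\|^2$. We split the interval $[m, M]$ into sub-intervals $\Delta_k$ $(k = 1, 2, \ldots, n)$
and form the following elements of space $H$:

$$\frac{\Delta_k\mathcal{E}_\lambda x}{\sqrt{\Delta_k\varrho(\lambda)}}. \tag{250}$$

If the function $\varrho(\lambda)$ is constant in the interval $\Delta_k$, then $\Delta_k\mathcal{E}_\lambda x = 0$,
and the corresponding expression (250) is meaningless. We agree to
throw such meaningless terms out of future formulae. By (183) and
(247), the remaining elements (250) are mutually orthogonal and
normalized. The Fourier coefficients of the element $y$ with respect to
system (250) have the form

$$\frac{(y, \Delta_k\mathcal{E}_\lambda x)}{\sqrt{\Delta_k\varrho(\lambda)}} = \frac{\Delta_k\varphi_y(\lambda)}{\sqrt{\Delta_k\varrho(\lambda)}}.$$

The square of the norm of the difference between $y$ and its Fourier
series is given by the familiar formula [121]:

$$\Big\| y - \sum_{k=1}^{n} \frac{\Delta_k\varphi_y(\lambda)}{\Delta_k\varrho(\lambda)}\,\Delta_k\mathcal{E}_\lambda x \Big\|^2 = \|y\|^2 - \sum_{k=1}^{n} \frac{|\Delta_k\varphi_y(\lambda)|^2}{\Delta_k\varrho(\lambda)}, \tag{251}$$

which leads us to Bessel's inequality:

$$\sum_{k=1}^{n} \frac{|\Delta_k\varphi_y(\lambda)|^2}{\Delta_k\varrho(\lambda)} \le \|y\|^2, \tag{252}$$

and in the limit:

$$\int_{m}^{M} \frac{|d\varphi_y(\lambda)|^2}{d\varrho(\lambda)} \le \|y\|^2. \tag{253}$$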


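THEOREM 1. If $y \in C_x$, we have

$$\|y\|^2 = \int_{m}^{M} \frac{|d\varphi_y(\lambda)|^2}{d\varrho(\lambda)}. \tag{254}$$

If $y \in C_x$, in view of the fact that $C_x$ is the closed linear envelope
of $\mathcal{E}_\lambda x$, given any $\varepsilon > 0$, there exist a finite set of elements $\mathcal{E}_{\lambda_s} x$
$(s = 1, 2, \ldots, p)$ of $\mathcal{E}_\lambda x$ and numbers $c_s$ such that

$$y = \sum_{s=1}^{p} c_s\,\mathcal{E}_{\lambda_s} x + z \quad \text{and} \quad \|z\| \le \varepsilon. \tag{255}$$

We take the points $\lambda_s$ $(s = 1, 2, \ldots, p)$ as the points of subdivision
of the interval $[m, M]$, adding the points $\lambda_0 = m$ and $\lambda_{p+1} = M$, if
these are not included among the $\lambda_s$, and we write $\Delta_s$ for the
sub-intervals $[\lambda_{s-1}, \lambda_s]$ $(s = 1, 2, \ldots, p + 1)$ thus obtained. We have
$\mathcal{E}_{\lambda_0} = 0$ and $\mathcal{E}_{\lambda_{p+1}} = E$, and on introducing the usual notation
$\Delta_s\mathcal{E}_\lambda = \mathcal{E}_{\lambda_s} - \mathcal{E}_{\lambda_{s-1}}$:

$$\mathcal{E}_{\lambda_1} = \Delta_1\mathcal{E}_\lambda; \quad \mathcal{E}_{\lambda_2} = \Delta_1\mathcal{E}_\lambda + \Delta_2\mathcal{E}_\lambda; \quad \mathcal{E}_{\lambda_3} = \Delta_1\mathcal{E}_\lambda + \Delta_2\mathcal{E}_\lambda + \Delta_3\mathcal{E}_\lambda; \; \ldots$$

The linear combination of $\mathcal{E}_{\lambda_s} x$ appearing in (255) can thus be
written as a linear combination of $\Delta_s\mathcal{E}_\lambda x$, and (255) can be rewritten
as

$$y = \sum_{s=1}^{p+1} b_s\,\Delta_s\mathcal{E}_\lambda x + z \quad \text{and} \quad \|z\| \le \varepsilon,$$

where $b_s$ are new coefficients. In other words, we have

$$\Big\| y - \sum_{s=1}^{p+1} b_s\,\Delta_s\mathcal{E}_\lambda x \Big\| \le \varepsilon. \tag{255$_1$}$$

This inequality is all the more preserved if the sum above is replaced
by the Fourier series of the element $y$ with respect to the orthonormal
system [121]:

$$\frac{\Delta_s\mathcal{E}_\lambda x}{\sqrt{\Delta_s\varrho(\lambda)}} \qquad (s = 1, 2, \ldots, p + 1).$$

Now, by (251), inequality (255$_1$) becomes

$$\|y\|^2 - \sum_{s=1}^{p+1} \frac{|\Delta_s\varphi_y(\lambda)|^2}{\Delta_s\varrho(\lambda)} \le \varepsilon^2,$$

i.e.

$$\sum_{s=1}^{p+1} \frac{|\Delta_s\varphi_y(\lambda)|^2}{\Delta_s\varrho(\lambda)} \ge \|y\|^2 - \varepsilon^2.$$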


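On comparing this inequality with (253) and recalling that $\varepsilon$ is
arbitrary, we see that the strict upper bound of the sums appearing
in inequality (252) is equal to $\|y\|^2$, i.e. (254) holds. By using this
formula and the formula

$$(y, z) = \frac{1}{4}\,\|y + z\|^2 - \frac{1}{4}\,\|y - z\|^2 + \frac{i}{4}\,\|y + iz\|^2 - \frac{i}{4}\,\|y - iz\|^2,$$

a more general formula is obtained for $y$ and $z$ belonging to $C_x$:

$$(y, z) = \int_{m}^{M} \frac{d\varphi_y(\lambda)\,\overline{d\varphi_z(\lambda)}}{d\varrho(\lambda)}, \tag{256}$$

where

$$\varphi_z(\lambda) = (z, \mathcal{E}_\lambda x). \tag{257}$$

In order to deduce similar formulae for a bilinear functional, we
prove the following theorem.

THEOREM 2. If $y \in C_x$, then $\mathcal{E}_\mu y$ and $Ay$ also belong to $C_x$.

Since $y \in C_x$, either $y$ is a finite linear combination of elements
$\mathcal{E}_\lambda x$:

$$y = \sum_{s=1}^{p} c_s\,\mathcal{E}_{\lambda_s} x, \tag{258}$$

or $y$ is the limit of such linear combinations. In the former case:

$$\mathcal{E}_\mu y = \sum_{s=1}^{p} c_s\,\mathcal{E}_\mu\,\mathcal{E}_{\lambda_s} x.$$

But, by (180), $\mathcal{E}_\mu\,\mathcal{E}_{\lambda_s} = \mathcal{E}_\mu$ for $\mu \le \lambda_s$ and $\mathcal{E}_\mu\,\mathcal{E}_{\lambda_s} = \mathcal{E}_{\lambda_s}$ for $\mu > \lambda_s$,
i.e. $\mathcal{E}_\mu y$ is also a finite linear combination of elements (243) and
$\mathcal{E}_\mu y \in C_x$. If $y$ is a limit of finite linear combinations of elements (243):

$$y = \lim_{n \to \infty} \sum_{s=1}^{p_n} c_s^{(n)}\,\mathcal{E}_{\lambda_s^{(n)}} x,$$

then

$$\mathcal{E}_\mu y = \lim_{n \to \infty} \sum_{s=1}^{p_n} c_s^{(n)}\,\mathcal{E}_\mu\,\mathcal{E}_{\lambda_s^{(n)}} x,$$

i.e. $\mathcal{E}_\mu y$ is also a limit of finite linear combinations of elements (243), so that in
this case also $\mathcal{E}_\mu y \in C_x$. By (205), the element $Ay$ is a limit of finite
linear combinations of $\mathcal{E}_\mu y$. Any $\mathcal{E}_\mu y \in C_x$ by what has been proved,
so that every finite linear combination of $\mathcal{E}_\mu y$ and the limit of such
linear combinations also belong to $C_x$, i.e. $Ay \in C_x$, and the theorem
is proved.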


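We can therefore write (256) with $y$ replaced by $Ay$.
Now, $\varphi_y(\lambda)$ is replaced by the function

$$\varphi_{Ay}(\lambda) = (Ay, \mathcal{E}_\lambda x) = \Big( \int_{m}^{M} \mu\,d\mathcal{E}_\mu y, \; \mathcal{E}_\lambda x \Big) = \int_{m}^{M} \mu\,d_\mu (y, \mathcal{E}_\mu\,\mathcal{E}_\lambda x),$$

and we obtain, on taking (180) into account:

$$\varphi_{Ay}(\lambda) = \int_{m}^{\lambda} \mu\,d_\mu \varphi_y(\mu),$$

and (256) gives us

$$(Ay, z) = \int_{m}^{M} \frac{d\Big( \int_{m}^{\lambda} \mu\,d\varphi_y(\mu) \Big)\,\overline{d\varphi_z(\lambda)}}{d\varrho(\lambda)},$$

or, on recalling the property of Hellinger integrals [83], we have the
formula

$$(Ay, z) = \int_{m}^{M} \lambda\,\frac{d\varphi_y(\lambda)\,\overline{d\varphi_z(\lambda)}}{d\varrho(\lambda)}. \tag{259}$$

Further, by (254), the expression on the right-hand side of (251)
tends to zero as the sub-intervals $\Delta_k$ become indefinitely smaller,
so that

$$\sum_{k=1}^{n} \frac{\Delta_k\varphi_y(\lambda)}{\Delta_k\varrho(\lambda)}\,\Delta_k\mathcal{E}_\lambda x \to y.$$

The terms of this last sum are elements of $H$, and the limit of this
sum may naturally be written as a Hellinger integral, in the same way
as previously for the sums $\sigma_\delta$:

$$y = \int_{m}^{M} \frac{d\varphi_y(\lambda)}{d\varrho(\lambda)}\,d\mathcal{E}_\lambda x \qquad (y \in C_x). \tag{260}$$

If we apply this formula to the element $Ay$ instead of $y$, we obtain
by similar arguments:

$$Ay = \int_{m}^{M} \lambda\,\frac{d\varphi_y(\lambda)}{d\varrho(\lambda)}\,d\mathcal{E}_\lambda x, \tag{261}$$

or, as the limit of a sum:

$$\sum_{k=1}^{n} \nu_k\,\frac{\Delta_k\varphi_y(\lambda)}{\Delta_k\varrho(\lambda)}\,\Delta_k\mathcal{E}_\lambda x \to Ay. \tag{262}$$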


Notice that we can use similar sums and passage to the limit in
$H$ to obtain a general definition of the Hellinger integral for elements
of $H$. Now, instead of applying (256) and (260) to the element $Ay$,
let us apply these formulae to the element $\mathcal{E}_\mu y$, which also belongs
to $C_x$, $\mu$ being any fixed number of the interval $[m, M]$. We have

$$\varphi_{\mathcal{E}_\mu y}(\lambda) = (\mathcal{E}_\mu y, \mathcal{E}_\lambda x) = (y, \mathcal{E}_\mu\,\mathcal{E}_\lambda x) =
\begin{cases}
(y, \mathcal{E}_\lambda x) & \text{for } \lambda \le \mu, \\
(y, \mathcal{E}_\mu x) & \text{for } \lambda > \mu,
\end{cases}$$

i.e.

$$\varphi_{\mathcal{E}_\mu y}(\lambda) =
\begin{cases}
\varphi_y(\lambda) & \text{for } \lambda \le \mu, \\
\varphi_y(\mu) & \text{for } \lambda > \mu,
\end{cases}$$

and the above-mentioned formulae at once give us

$$(\mathcal{E}_\mu y, z) = \int_{m}^{\mu} \frac{d\varphi_y(\lambda)\,\overline{d\varphi_z(\lambda)}}{d\varrho(\lambda)}, \tag{263}$$

$$\mathcal{E}_\mu y = \int_{m}^{\mu} \frac{d\varphi_y(\lambda)}{d\varrho(\lambda)}\,d\mathcal{E}_\lambda x. \tag{264}$$

Notice that (256) is equivalent to the generalized closure equation,
whilst (259) and (261) are equivalent to (241) and (240) of the previous
section. The formulae of the present section have been deduced on
the assumption that $y$ and $z \in C_x$. A self-conjugate operator $A$ is
said to have a simple continuous spectrum if there exists an element
$x$ of $H$ such that $C_x$ is the same as $H$. If this is the case, and we take
as $x$ the element just mentioned, our formulae hold for any elements
$y$ and $z$ of $H$.
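A standard illustration of a simple continuous spectrum, given here as a sketch under assumptions rather than as an example from the text, is the operator of multiplication by the independent variable in the space $L_2[0, 1]$, so that $m = 0$ and $M = 1$. With the generating element $x(t) \equiv 1$, the functions $\varrho$ and $\varphi_y$ of this section and the integrals (254) and (259) take a familiar form:

```latex
\[
H = L_2[0, 1], \qquad (Ay)(t) = t\,y(t), \qquad
(\mathcal{E}_\lambda y)(t) =
\begin{cases}
y(t), & t \le \lambda, \\
0,    & t > \lambda
\end{cases}
\qquad (0 \le \lambda \le 1);
\]
\[
x(t) \equiv 1: \qquad
\varrho(\lambda) = \|\mathcal{E}_\lambda x\|^2 = \lambda, \qquad
\varphi_y(\lambda) = (y, \mathcal{E}_\lambda x) = \int_0^\lambda y(t)\,dt;
\]
\[
\int_0^1 \frac{|d\varphi_y(\lambda)|^2}{d\varrho(\lambda)}
  = \int_0^1 |y(\lambda)|^2\,d\lambda = \|y\|^2, \qquad
(Ay, z) = \int_0^1 \lambda\,
  \frac{d\varphi_y(\lambda)\,\overline{d\varphi_z(\lambda)}}{d\varrho(\lambda)}
  = \int_0^1 \lambda\,y(\lambda)\,\overline{z(\lambda)}\,d\lambda .
\]
```

Here the elements $\mathcal{E}_\lambda x$ are the characteristic functions of the intervals $[0, \lambda]$. Their finite linear combinations are the step functions, which are dense in $L_2[0, 1]$, so that $C_x$ coincides with $H$ and the formulae hold for all $y$ and $z$.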


148. Invariant subspaces. Before investigating a non-simple spectrum
and a mixed spectrum, i.e. the case when the eigenelements exist
but do not form a closed system, we must introduce a new concept
and prove certain facts.

DEFINITION. A subspace $L$ is said to be invariant under the operator
$A$ when the following condition is satisfied: if $x \in L$, then also $Ax \in L$.
Alternatively, $L$ is said to reduce $A$.

The meaning of the definition is as follows. If $L$ reduces $A$, $A$ can
be regarded as an operator separately defined in $L$, where $L$ may
be finite-dimensional or may be taken as a Hilbert space. In other
words, an operator $A$, defined in the whole of $H$, induces an operator
defined in $L$, which coincides with $A$ for elements of $L$. The
consideration of $A$ on separate invariant subspaces simplifies investigation of
it. Notice that, if $A$ is a self-conjugate operator in $H$, it is obviously
a self-conjugate operator in any invariant subspace for it. Our future
investigations will only cover subspaces invariant under self-conjugate
operators.

THEOREM 1. If the subspace $L$ reduces the self-conjugate operator $A$,
the complementary subspace $H \ominus L$ also reduces $A$. The necessary and
sufficient condition for $L$ to reduce the self-conjugate operator $A$ is that
the projector $P_L$ commutes with $A$, i.e.

$$P_L A = A P_L. \tag{265}$$

If $L$ reduces $A$, $Az \in L$ when $z \in L$. We have to show that, if
$x \perp L$, then $Ax \perp L$. Let $z$ be any element of $L$. Then $Az \in L$, and
we have

$$(Ax, z) = (x, Az) = 0,$$

and our assertion is proved. Let us turn to the proof of condition
(265). We write down the obvious equation

$$Ax = A P_L x + A\,(E - P_L)\,x.$$

If $L$ reduces $A$, then $A(P_L x) \in L$, and by what has just been
proved, $A[(E - P_L)\,x] \in H \ominus L$, so that the first term on the
right-hand side is the projection of $Ax$ into $L$, i.e. $P_L A x = A P_L x$ for any
$x$, and the necessity of (265) is proved. Conversely, let (265) be
satisfied, and $x \in L$. Now, $Ax = A(P_L x) = P_L(Ax)$, i.e. $Ax \in L$, and
the proof is complete. By using Theorem 2 of [143], the following
immediate corollary is obtained for our present theorem:

COROLLARY. The necessary and sufficient condition for the subspace
$L$ to reduce $A$ is that it reduce $\mathcal{E}_\lambda$ for any $\lambda$.
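The commutation criterion of Theorem 1 is easy to try out on matrices. The sketch below is only an illustration under assumptions: the 3 x 3 symmetric matrix, the two projectors and the helper names `matmul` and `commutes` are hypothetical. It compares a subspace that reduces $A$ with one that does not.

```python
def matmul(a, b):
    """Product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutes(a, b):
    """True when a b = b a (exact for the integer matrices used here)."""
    return matmul(a, b) == matmul(b, a)

# Assumed example: A acts separately on span(e1, e2) and on span(e3).
A = [[2, 1, 0],
     [1, 2, 0],
     [0, 0, 5]]

P_L = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]  # projector into L = span(e1, e2)
P_N = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]  # projector into N = span(e1)

reduces_L = commutes(P_L, A)  # L reduces A
reduces_N = commutes(P_N, A)  # N does not: A e1 = (2, 1, 0) leaves N
print(reduces_L, reduces_N)
```

The projector into the complementary subspace, $E - P_L$, commutes with $A$ whenever $P_L$ does, which corresponds to the first assertion of the theorem.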

THEOREM 2. If the mutually orthogonal subspaces $L_k$ $(k = 1, 2, \ldots)$
reduce $A$, their orthogonal sum

$$L = L_1 \oplus L_2 \oplus L_3 \oplus \ldots$$

also reduces $A$.

By hypothesis, $A$ commutes with all the $P_{L_k}$, and hence commutes
with their sum:

$$P_L = P_{L_1} + P_{L_2} + P_{L_3} + \ldots,$$

which proves the theorem. Theorem 2 also holds when $A$ is not
self-conjugate.

Some simple facts must be mentioned in connection with the concept
of subspace invariant under a self-conjugate operator. If the projector
$P_M$ commutes with the projector $P_L$, then $L$ reduces $P_M$, and the
operator $P_M$ induces in $L$ the projector $P_{LM}$ onto the subspace
$LM$. Further, let $L$ reduce $A$, and hence reduce its spectral function
$\mathcal{E}_\lambda$. Let $A^{(1)}$ and $\mathcal{E}_\lambda^{(1)}$ denote the operators which are induced in $L$ by the
operators $A$ and $\mathcal{E}_\lambda$. It may easily be shown that $\mathcal{E}_\lambda^{(1)}$ is a resolution
of the identity for $A^{(1)}$. If $x \in L$ in (205), we can replace $A$ by $A^{(1)}$ and $\mathcal{E}_\lambda$
by $\mathcal{E}_\lambda^{(1)}$, and $\mathcal{E}_\lambda^{(1)}$ is the spectral function of the operator $A^{(1)}$ defined
in $L$. By (228), $L$ also reduces the resolvent $R_l$ of the operator $A$, where
$R_l$ induces in $L$ the resolvent of the operator $A^{(1)}$. Let $A^{(1)}$, $A^{(2)}$, $\mathcal{E}_\lambda^{(1)}$
and $\mathcal{E}_\lambda^{(2)}$ be the operators induced by the self-conjugate operators $A$ and
$\mathcal{E}_\lambda$ in the invariant subspace $L$ and the complementary subspace
$H \ominus L$, whilst $x = x_1 + x_2$ and $y = y_1 + y_2$ are the decompositions of $x$
and $y$ onto $L$ and $H \ominus L$. We have the obvious equations:

$$Ax = A^{(1)}x_1 + A^{(2)}x_2; \qquad \mathcal{E}_\lambda x = \mathcal{E}_\lambda^{(1)}x_1 + \mathcal{E}_\lambda^{(2)}x_2;$$
$$(Ax, y) = (A^{(1)}x_1, y_1) + (A^{(2)}x_2, y_2). \tag{266}$$


Similar formulae hold when $H$ is decomposed into a finite or denumerable
number of mutually orthogonal subspaces reducing $A$. The
whole of space $H$, and the zero subspace, i.e. the subspace that only
contains the zero element, are trivial invariant subspaces for any
operator. If an operator has no other invariant subspaces, it is
described as an irreducible operator. Every subspace of eigenelements
corresponding to a given eigenvalue $\lambda_0$ of an operator $A$ is a subspace
invariant under $A$, and in this subspace the operator $A$ reduces to
multiplication by the number $\lambda_0$. If $x_0$ is any given eigenelement
corresponding to the eigenvalue $\lambda_0$, the set of elements of the form $ax_0$, where
$a$ is any complex number, is also a subspace that reduces $A$. If $L_k$
$(k = 1, 2, \ldots)$ are all the subspaces of eigenelements of a
self-conjugate operator $A$, their orthogonal sum $H'$ reduces $A$. Let $A'$ and
$\mathcal{E}'_\lambda$ be the operators induced in $H'$ by the operators $A$ and $\mathcal{E}_\lambda$.
We have, by (234) and (236):

$$A'x = \sum_k \lambda_k (\mathcal{E}_{\lambda_k} - \mathcal{E}_{\lambda_k - 0})\, x,$$
$$\mathcal{E}'_\lambda x = \mathcal{E}_\lambda \sum_k (\mathcal{E}_{\lambda_k} - \mathcal{E}_{\lambda_k - 0})\, x = \sum_{\lambda_k \le \lambda} (\mathcal{E}_{\lambda_k} - \mathcal{E}_{\lambda_k - 0})\, x, \tag{267}$$

i.e. $\mathcal{E}'_\lambda$ amounts to the sum of the jumps of the function $\mathcal{E}_\lambda$ at the points
$\lambda_k$ satisfying $\lambda_k \le \lambda$. The operator $\mathcal{E}''_\lambda$, induced by $\mathcal{E}_\lambda$ in the subspace
$H''$ complementary to $H'$, is a projector onto the subspace $H''M_\lambda$,
where $M_\lambda$ is the subspace corresponding to the projector $\mathcal{E}_\lambda$. Any
element of $H''$ is orthogonal to all the $L_k$, i.e. $(\mathcal{E}_{\lambda_k} - \mathcal{E}_{\lambda_k - 0})\, x = 0$
if $x \in H''$, and, given any $x$ belonging to $H''$, we can write $\mathcal{E}''_\lambda$ as the
difference

$$\mathcal{E}''_\lambda = \mathcal{E}_\lambda - \sum_{\lambda_k \le \lambda} (\mathcal{E}_{\lambda_k} - \mathcal{E}_{\lambda_k - 0}),$$

so that $\mathcal{E}''_\lambda$ is continuous for all $\lambda$. Therefore, if $A$ does not have a
purely point spectrum, the subspace $H''$ contains non-zero elements,
the spectral function is continuous in it, and the operator has no
eigenvalues at all in $H''$. The eigenelements in $H'$ form a closed system,
and the operator $A$ has a purely point spectrum in $H'$.
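
For a symmetric matrix the whole space plays the part of $H'$, and the formulae (267) can be traced by hand. The following sketch is illustrative only: the $2 \times 2$ matrix, its eigenvalues 1 and 3, and the names `JUMPS`, `spectral_function` and `rebuild_operator` are assumptions chosen for the example. The spectral function is built as the sum of the jumps at the eigenvalues not exceeding $\lambda$, and the operator is recovered as the sum of eigenvalues multiplied by jumps.

```python
# Hypothetical 2x2 example: A = [[2, 1], [1, 2]] has eigenvalues 1 and 3.
A = [[2.0, 1.0], [1.0, 2.0]]

# Projectors onto the subspaces of eigenelements: the jumps of the
# spectral function at the eigenvalues.
JUMPS = {
    1.0: [[0.5, -0.5], [-0.5, 0.5]],   # eigenelement (1, -1)
    3.0: [[0.5, 0.5], [0.5, 0.5]],     # eigenelement (1, 1)
}

def spectral_function(lam):
    """Sum of the jumps at all eigenvalues not exceeding lam."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for eigenvalue, jump in JUMPS.items():
        if eigenvalue <= lam:
            total = [[total[i][j] + jump[i][j] for j in range(2)]
                     for i in range(2)]
    return total

def rebuild_operator():
    """Sum over the eigenvalues of (eigenvalue times jump)."""
    return [[sum(ev * jump[i][j] for ev, jump in JUMPS.items())
             for j in range(2)]
            for i in range(2)]
```

In this model `spectral_function` is a step function of `lam`: it is the zero operator below the least eigenvalue and the identity operator from the greatest eigenvalue onwards, so the continuous part $H''$ reduces to the zero subspace.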


149. The general case of a continuous spectrum. We have seen
that, if an element $y$ belongs to the subspace $C_x$ formed in [147], then
$Ay \in C_x$ also, i.e. $C_x$ reduces $A$. If the operator $A$ does not have a
point spectrum, i.e. its spectral function $\mathcal{E}_\lambda$ is continuous for all
values of $\lambda$, and its continuous spectrum is not simple, the space
$H$ is expressible, as we shall show, as an orthogonal sum of subspaces
of type $C_x$. In each of these subspaces the operator induced by $A$
will have a simple continuous spectrum, and the formulae of [147]
hold for elements belonging to such a subspace. The corresponding
formulae for any elements of $H$ will be obtained with the aid of a
decomposition of the element over the above-mentioned subspaces,
and, by (266), the formulae are arrived at by adding the corresponding
formulae for the individual subspaces. The expression for $H$ as an
orthogonal sum of subspaces of type $C_x$ is deduced on the assumption
that space $H$ is separable. Let us take any closed orthonormal
system

$$u_1, u_2, u_3, \ldots$$


We form $C_{y_1}$ by putting $y_1 = u_1$. The element $u_2$ can be written in
the form $u_2 = v_2 + y_2$, where $v_2 \in C_{y_1}$ and $y_2 \perp C_{y_1}$. If $y_2 \ne 0$, $C_{y_2}$ is
obtained. Let us show that $C_{y_2} \perp C_{y_1}$. By hypothesis, we have
$(y_2, \mathcal{E}_\lambda y_1) = 0$ for any $\lambda$, since $\mathcal{E}_\lambda y_1 \in C_{y_1}$. We have further:
$(\mathcal{E}_\mu y_2, \mathcal{E}_\lambda y_1) = (y_2, \mathcal{E}_\mu \mathcal{E}_\lambda y_1) = 0$. Hence it follows that any linear
combination of elements $\mathcal{E}_\mu y_2$ is orthogonal to any linear combination of
$\mathcal{E}_\lambda y_1$, and in the limit, any element of $C_{y_2}$ is orthogonal to any element
of $C_{y_1}$. We next take the element $u_3$ and write it in the form
$u_3 = v_3 + y_3$, where $v_3 \in C_{y_1} \oplus C_{y_2}$ and $y_3 \perp C_{y_1} \oplus C_{y_2}$, and form $C_{y_3}$.
It may be shown as above that $C_{y_3} \perp C_{y_1}$ and $C_{y_3} \perp C_{y_2}$, and so on.
We obtain in this way a finite or denumerable number of mutually
orthogonal subspaces $C_{y_k}$, and since each element $x$ of $H$ can be
expanded in elements $u_k$ that form a closed system, the orthogonal
sum of subspaces $C_{y_k}$ must give the whole of $H$:

$$H = C_{y_1} \oplus C_{y_2} \oplus C_{y_3} \oplus \ldots \tag{268}$$
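
The successive formation of the subspaces $C_{y_k}$ has a simple finite-dimensional analogue, which may help in following the construction. The sketch below is an illustration under stated assumptions, not the construction of the text itself: for a symmetric matrix, $C_y$ is taken to be the span of $y, Ay, A^2y, \ldots$ (which in finite dimensions coincides with the span of the elements $\mathcal{E}_\lambda y$), exact rational arithmetic is used, and the names `cyclic_subspaces`, `remove_components` and `apply_matrix` are hypothetical.

```python
from fractions import Fraction

def dot(u, v):
    """Scalar product of two real vectors."""
    return sum(a * b for a, b in zip(u, v))

def apply_matrix(A, v):
    """Apply the matrix A (list of rows) to the vector v."""
    return [dot(row, v) for row in A]

def remove_components(v, basis):
    """Subtract from v its projections on the mutually orthogonal vectors of basis."""
    for b in basis:
        coeff = dot(v, b) / dot(b, b)
        v = [x - coeff * y for x, y in zip(v, b)]
    return v

def cyclic_subspaces(A, start_vectors):
    """Orthogonal bases of the successive subspaces C_{y_1}, C_{y_2}, ..."""
    all_basis = []   # orthogonal basis of the orthogonal sum formed so far
    subspaces = []
    for u in start_vectors:
        # y_k is the part of u_k orthogonal to the subspaces already formed.
        y = remove_components([Fraction(x) for x in u], all_basis)
        block = []
        # Orthogonalize y, Ay, A^2 y, ... until no new direction appears.
        while any(y):
            block.append(y)
            all_basis.append(y)
            y = remove_components(apply_matrix(A, y), all_basis)
        if block:
            subspaces.append(block)
    return subspaces

# Example: the eigenvalue 1 is double, so one subspace cannot exhaust the space.
A = [[1, 0, 0], [0, 1, 0], [0, 0, 2]]
blocks = cyclic_subspaces(A, [(1, 0, 1), (0, 1, 0), (0, 0, 1)])
dimensions = [len(block) for block in blocks]
```

With this choice of initial system the first subspace is two-dimensional, the second is one-dimensional, and the third initial vector contributes nothing new, since it already lies in the first subspace.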


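The formulae of [147] hold in each of the subspaces $C_{y_k}$, and we
can therefore write for any elements $y$ and $z$ of $H$:

$$(y, z) = \sum_k \int_m^M \frac{d(y, \mathcal{E}_\lambda y_k)\, d(\mathcal{E}_\lambda y_k, z)}{d\rho_k(\lambda)}, \tag{269}$$

$$(\mathcal{E}_\lambda y, z) = \sum_k \int_m^\lambda \frac{d(y, \mathcal{E}_\mu y_k)\, d(\mathcal{E}_\mu y_k, z)}{d\rho_k(\mu)}, \tag{270}$$

$$(Ay, z) = \sum_k \int_m^M \lambda\, \frac{d(y, \mathcal{E}_\lambda y_k)\, d(\mathcal{E}_\lambda y_k, z)}{d\rho_k(\lambda)}, \tag{271}$$

$$y = \sum_k \int_m^M \frac{d(y, \mathcal{E}_\lambda y_k)}{d\rho_k(\lambda)}\, d\mathcal{E}_\lambda y_k; \qquad Ay = \sum_k \int_m^M \lambda\, \frac{d(y, \mathcal{E}_\lambda y_k)}{d\rho_k(\lambda)}\, d\mathcal{E}_\lambda y_k, \tag{272}$$

$$\mathcal{E}_\lambda y = \sum_k \int_m^\lambda \frac{d(y, \mathcal{E}_\mu y_k)}{d\rho_k(\mu)}\, d\mathcal{E}_\mu y_k, \tag{273}$$

where $\rho_k(\lambda) = \|\mathcal{E}_\lambda y_k\|^2$, and the sums written may be either finite
or infinite. In the latter case, the convergence of the series in (272)
and (273), containing elements of $H$, has to be understood as a
convergence of elements of $H$.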

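The above method of forming the subspaces $C_{y_k}$ can be written in
terms of the formulae that follow. If $v$ is any element of $H$, its
projection $v_k$ onto $C_{y_k}$ is defined by the corresponding term of formulae
(272), i.e.

$$v_k = \int_m^M \frac{d(v, \mathcal{E}_\lambda y_k)}{d\rho_k(\lambda)}\, d\mathcal{E}_\lambda y_k,$$

and we obtain for $\mathcal{E}_\lambda v_k$, by taking (180) into account as usual,

$$\mathcal{E}_\lambda v_k = \int_m^\lambda \frac{d(v, \mathcal{E}_\mu y_k)}{d\rho_k(\mu)}\, d\mathcal{E}_\mu y_k. \tag{274}$$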


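This remark leads directly to the following formulae, which
correspond to the process described above for forming the subspaces $C_{y_k}$:

$$\mathcal{E}_\lambda y_1 = \mathcal{E}_\lambda u_1; \qquad \mathcal{E}_\lambda y_k = \mathcal{E}_\lambda u_k - \sum_{s=1}^{k-1} \int_m^\lambda \frac{d(u_k, \mathcal{E}_\mu y_s)}{d\rho_s(\mu)}\, d\mathcal{E}_\mu y_s. \tag{275}$$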

Different subspaces $C_{y_k}$ are in general obtained with a different
choice of initial system $u_k$. In fact the number of these subspaces
may turn out to be different. It is useful to extend these subspaces
as far as possible, right from the start.

It can be shown that a construction of $C_{y_k}$ is possible, such that the
following condition is fulfilled: every set of measure zero with respect
to $\rho_p(\lambda)$ is a set of measure zero with respect to $\rho_{p+1}(\lambda)$, $\rho_{p+2}(\lambda), \ldots$
also. In view of the results of [74], this condition is equivalent to
the following: every $\rho_p(\lambda)$ is expressible in terms of the preceding
$\rho_k(\lambda)$ by

$$\rho_p(\lambda) = \int_m^\lambda \varphi_k^{(p)}(\mu)\, d\rho_k(\mu) \qquad (k = 1, 2, \ldots, p - 1),$$

where the integral has to be understood as a Lebesgue-Stieltjes integral, and
the $\varphi_k^{(p)}(\lambda)$ are non-negative functions, measurable with respect to
$\rho_k(\lambda)$ and summable. When this condition is fulfilled, we shall say
that the subdivision of the spectral function is normal. It can be
shown that the number of subspaces is the same in different normal
subdivisions. We shall return later to this question.


150. The case of a mixed spectrum. As already mentioned, a
self-conjugate operator is said to have a mixed spectrum if eigenelements
of the operator exist, but the orthonormal system of eigenelements

$$x_1, x_2, x_3, \ldots \tag{276}$$

is not closed in $H$. As above, let $L_k$ be the subspaces of eigenelements
corresponding to the eigenvalues $\lambda_k$. As we saw in [148], in the invariant
subspace

$$H' = L_1 \oplus L_2 \oplus L_3 \oplus \ldots$$

the operator has a purely point spectrum with eigenvalues $\lambda_k$ and
subspaces of eigenelements $L_k$. Elements (276) here form a closed
system in $H'$, consisting of eigenelements of $A$.


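The operator $A$ has only a purely continuous spectrum in the
complementary subspace $H''$. By making use of the formulae of [149]
and [146], the following general formulae are obtained, in which the
sums are due to the point spectrum in $H'$, and the integrals due to
the continuous spectrum in $H''$:

$$(y, z) = \sum_k a_k \bar{b}_k + \sum_k \int_m^M \frac{d(y, \mathcal{E}_\lambda y_k)\, d(\mathcal{E}_\lambda y_k, z)}{d\rho_k(\lambda)}, \tag{277}$$

$$(Ay, z) = \sum_k \mu_k a_k \bar{b}_k + \sum_k \int_m^M \lambda\, \frac{d(y, \mathcal{E}_\lambda y_k)\, d(\mathcal{E}_\lambda y_k, z)}{d\rho_k(\lambda)}, \tag{278}$$

$$y = \sum_k a_k x_k + \sum_k \int_m^M \frac{d(y, \mathcal{E}_\lambda y_k)}{d\rho_k(\lambda)}\, d\mathcal{E}_\lambda y_k, \tag{279}$$

$$Ay = \sum_k \mu_k a_k x_k + \sum_k \int_m^M \lambda\, \frac{d(y, \mathcal{E}_\lambda y_k)}{d\rho_k(\lambda)}\, d\mathcal{E}_\lambda y_k, \tag{280}$$

where $a_k$ and $b_k$ are the Fourier coefficients of the elements $y$ and $z$ with
respect to system (276), and the $\mu_k$ are the eigenvalues of $A$
corresponding to the eigenelements $x_k$.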

By using the above subdivision of the operator into an operator
with a purely point spectrum in $H'$ and one with a continuous
spectrum in $H''$, we can make a classification of the points of the spectrum.

DEFINITION. We say that $\lambda_0$ belongs to the point spectrum if $\lambda_0$ is an
eigenvalue of $A$. If $\lambda_0$ is a limit point of the point spectrum, i.e. there are
eigenvalues different from $\lambda_0$ in any $\varepsilon$-neighbourhood of it, $\lambda_0$ is said to
belong to the limiting spectrum. Finally, $\lambda_0$ is said to belong to the
continuous spectrum if $\lambda_0$ belongs to the spectrum of the operator $A''$ induced
in $H''$ by $A$, i.e. if the spectral function $\mathcal{E}''_\lambda$ of $A''$ is not constant in any
interval containing $\lambda_0$ as an interior point.

Every point of the spectrum of the operator $A$ belongs to at least
one of these three categories, though it may happen that, say, $\lambda_0$ belongs
to all three categories simultaneously. A further concept is sometimes
brought in, viz. $\lambda_0$ is said to be a point of condensation of the spectrum
if it is either an eigenvalue of infinite rank or an element of the
limiting spectrum or an element of the continuous spectrum.


151. Differential solutions. Let us take an operator with a purely
continuous spectrum. Given any $x$, the elements $\mathcal{E}_\lambda x$ satisfy an
equation analogous to the equation $Ax = \lambda_0 x$ in the case of a point
spectrum. Let $\Delta = [\lambda_1, \lambda_2]$ be any interval, and let $\Delta\mathcal{E}_\lambda = \mathcal{E}_{\lambda_2} - \mathcal{E}_{\lambda_1}$. We can use
property (180) to write the following analogue of equation (213):

$$A\,\Delta\mathcal{E}_\lambda = \int_m^M \lambda\, d\mathcal{E}_\lambda \cdot \Delta\mathcal{E}_\lambda = \int_\Delta \lambda\, d\mathcal{E}_\lambda.$$

We bring in the element $x(\lambda) = \mathcal{E}_\lambda x$, continuously dependent on
the parameter $\lambda$ in the interval $[m, M]$ in the sense that
$\|x(\lambda) - x(\lambda_0)\| \to 0$ as $\lambda \to \lambda_0$ for any $\lambda_0$ of $[m, M]$. It will be seen from the
above equation that $x(\lambda)$ satisfies

$$A[\Delta x(\lambda)] = \int_\Delta \lambda\, dx(\lambda). \tag{281}$$

In this case, $x(\lambda)$ is said to be a differential solution of the equation
$Ax = \lambda x$. The vectors $y_k(\lambda) = \mathcal{E}_\lambda y_k$ formed in [149] are therefore
differential solutions. Given different $k$, they lie in the orthogonal
subspaces $C_{y_k}$, and we therefore have, for any intervals $\Delta_1$ and $\Delta_2$:

$$(\Delta_1 y_p(\lambda), \Delta_2 y_q(\lambda)) = 0 \qquad (p \ne q). \tag{282}$$

If $\Delta_1$ and $\Delta_2$ have no common interior points, we have, by (180),

$$(\Delta_1 y_p(\lambda), \Delta_2 y_p(\lambda)) = 0. \tag{283}$$

If $\Delta_1$ and $\Delta_2$ have a common part $\Delta_{12}$, then, by (180),

$$(\Delta_1 y_p(\lambda), \Delta_2 y_p(\lambda)) = \|\Delta_{12}\, y_p(\lambda)\|^2. \tag{284}$$


We now bring in the concept of a complete system of differential
solutions. Any system $y_p(\lambda)$ of differential solutions, which is orthogonal
in the sense of (282), is said to be complete if every element $x$ that is
orthogonal to all the $y_p(\lambda)$, i.e. that satisfies

$$(y_p(\lambda), x) = 0 \tag{285}$$

for any $p$ and any $\lambda$, is the zero element. It may easily be shown that
the solutions $y_p(\lambda) = \mathcal{E}_\lambda y_p$ formed above are a complete system.
For, we can conclude from (285) that the element $x$ is orthogonal
to the subspace $C_{y_p}$, which is the closed linear envelope of
$y_p(\lambda) = \mathcal{E}_\lambda y_p$, and this is true for any $p$. But the orthogonal sum of the
$C_{y_p}$ is the whole of $H$, and $x$ is therefore orthogonal to the whole
of $H$, i.e. is the zero element.


We started out from the spectral function $\mathcal{E}_\lambda$ when forming the
solutions of equation (281). We shall now start out from the equation
itself. Suppose we have somehow succeeded in forming a solution
$x(\lambda)$ of (281). Since $x(\lambda)$ appears in this equation only under the
difference and differential signs, we can obtain further solutions by
subtracting any element independent of $\lambda$ from $x(\lambda)$. In particular,
$x(\lambda) - x(m)$ will be a solution, and we can therefore always assume
that $x(m) = 0$. We shall show below that any solution of (281),
continuously dependent on $\lambda$ in the interval $[m, M]$ and satisfying
the condition $x(m) = 0$, necessarily has the form $x(\lambda) = \mathcal{E}_\lambda x$. Suppose
we have somehow succeeded in forming a finite or infinite number of
solutions $y_p(\lambda)$ of (281), mutually orthogonal in the sense of (282).
By what has been proved, each of them has the form $y_p(\lambda) = \mathcal{E}_\lambda y_p$,
where $y_p$ is an element of $H$. The closed linear envelope of each $y_p(\lambda)$
is a subspace $C_{y_p}$, these subspaces being orthogonal, by (282). The
completeness of the solutions $y_p(\lambda)$ is equivalent to the fact that the
orthogonal sum of the $C_{y_p}$ is the whole of $H$. If these solutions in
fact form a complete system, we can write the formulae of [149]
with $\mathcal{E}_\lambda y_k$ replaced by $y_k(\lambda)$. We can therefore obtain these formulae
by starting out from any desired orthogonal complete system of
continuous differential solutions. The completeness of a system of
mutually orthogonal solutions, no matter how constructed, may be
verified by (271) for a bilinear functional, or by (273), if the spectral
function $\mathcal{E}_\lambda$ is known.

Notice also that equations other than (281) can be formed for
$x(\lambda) = \mathcal{E}_\lambda x$. Supposing that the interval $\Delta$ does not contain $\lambda = 0$,
we can write, by (180),

$$\int_m^M \lambda\, d\mathcal{E}_\lambda \cdot \int_\Delta \frac{1}{\lambda}\, d\mathcal{E}_\lambda = \Delta\mathcal{E}_\lambda,$$

and hence obtain the equation for $x(\lambda)$:

$$A\left[\int_\Delta \frac{1}{\lambda}\, dx(\lambda)\right] = \Delta x(\lambda)$$

or

$$\int_\Delta \frac{1}{\lambda}\, d[Ax(\lambda)] = \Delta x(\lambda).$$

Let us now turn to the proofs of the above assertions.


THEOREM. Every solution of equation (281), continuous in the interval $[m, M]$
and equal to the zero element when $\lambda = m$, has the form $x(\lambda) = \mathcal{E}_\lambda x(M)$.

Given that $x(m) = 0$, equation (281) gives

$$\int_m^\lambda \mu\, dx(\mu) = Ax(\lambda),$$

$\mu$ being the variable of integration, both here and below. Having fixed two
numbers $\mu_1 < \mu_2$ in any manner, we obtain

$$\int_m^\lambda \mu\, d(x(\mu), x(\mu_2) - x(\mu_1)) = (Ax(\lambda), x(\mu_2) - x(\mu_1)).$$

Using (281) and the fact that $A$ is self-conjugate, we can write

$$(Ax(\lambda), x(\mu_2) - x(\mu_1)) = (x(\lambda), Ax(\mu_2) - Ax(\mu_1)) = \int_{\mu_1}^{\mu_2} \mu\, d(x(\lambda), x(\mu)),$$

and hence we arrive at the equation

$$\int_m^\lambda \mu\, d(x(\mu), x(\mu_2) - x(\mu_1)) = \int_{\mu_1}^{\mu_2} \mu\, d(x(\lambda), x(\mu)).$$

The integral on the right can be integrated by parts, and the mean value
theorem applied to the resulting integral:

$$\int_m^\lambda \mu\, d(x(\mu), x(\mu_2) - x(\mu_1)) = \mu_2 (x(\lambda), x(\mu_2)) - \mu_1 (x(\lambda), x(\mu_1)) - (\mu_2 - \mu_1)(x(\lambda), x(\mu_3))$$

($\mu_3$ belongs to $[\mu_1, \mu_2]$), or

$$\int_m^\lambda \mu\, d(x(\mu), x(\mu_2) - x(\mu_1)) = (\mu_2 - \mu_1)(x(\lambda), x(\mu_2)) + \mu_1 (x(\lambda), x(\mu_2) - x(\mu_1)) - (\mu_2 - \mu_1)(x(\lambda), x(\mu_3)),$$

and we can rewrite this last as

$$\int_m^\lambda (\mu - \mu_1)\, d\,\frac{(x(\mu), x(\mu_2) - x(\mu_1))}{\mu_2 - \mu_1} = (x(\lambda), x(\mu_2) - x(\mu_3)).$$

On introducing the continuous functions

$$\omega(\mu) = \frac{(x(\mu), x(\mu_2) - x(\mu_1))}{\mu_2 - \mu_1}; \qquad f(\lambda) = (x(\lambda), x(\mu_2) - x(\mu_3)), \tag{286}$$

this can in turn be rewritten as

$$\int_m^\lambda (\mu - \mu_1)\, d\omega(\mu) = f(\lambda),$$

where we assume $\lambda < \mu_1 < \mu_2$ and, obviously, $\omega(m) = 0$. This last equation
is readily solved for $\omega(\lambda)$. All we need do is integrate the left-hand side by
parts, which gives $\omega(\lambda) = f(\lambda)/(\lambda - \mu_1) + u(\lambda)$, where $u(\lambda)$ is a new required
function, equal to zero when $\lambda = m$:

$$\omega(\lambda) = \frac{f(\lambda)}{\lambda - \mu_1} + \int_m^\lambda \frac{f(\mu)}{(\mu - \mu_1)^2}\, d\mu,$$

or, on recalling the notation of (286),

$$\frac{(x(\lambda), x(\mu_2) - x(\mu_1))}{\mu_2 - \mu_1} = \frac{(x(\lambda), x(\mu_2) - x(\mu_3))}{\lambda - \mu_1} + \int_m^\lambda \frac{(x(\mu), x(\mu_2) - x(\mu_3))}{(\mu - \mu_1)^2}\, d\mu.$$

If $\mu_1 \to \mu_2$, then $\mu_3 \to \mu_2$ and the first term on the right-hand side tends to
zero. The same can be said of the integral, since
$|(x(\mu), x(\mu_2) - x(\mu_3))| \le C \|x(\mu_2) - x(\mu_3)\|$, where $C$ is the greatest value of $\|x(\mu)\|$ in the interval
$[m, M]$. We have taken the case when $\mu_1 \to \mu_2$ from smaller values. A similar
approach can be used when $\lambda < \mu_2 < \mu_1$ and $\mu_1 \to \mu_2$. It therefore follows
from the last formula that

$$\frac{d}{d\mu}\,(x(\lambda), x(\mu)) = 0 \quad \text{for } \mu > \lambda,$$

and hence

$$(x(\lambda), x(\mu)) = (x(\lambda), x(\lambda)) = \|x(\lambda)\|^2 \quad \text{for } \mu > \lambda.$$

On applying this formula with $\mu = M$ to the solution $y(\lambda) = x(\lambda) - \mathcal{E}_\lambda x(M)$
of equation (281), which vanishes when $\lambda = m$ and $\lambda = M$, we get $\|y(\lambda)\|^2 = 0$,
i.e. $x(\lambda) = \mathcal{E}_\lambda x(M)$, and the theorem is proved.


152. The operation of multiplication by the independent variable.
Let us return to the results of [147] and consider the function
space $L_\rho^2$ of the functions $f(\lambda)$, square integrable with respect to the
function $\rho(\lambda)$ defined by (245). The class $L_\rho^2$ is the class of functions
$f(\lambda)$, defined in the interval $[m, M]$, measurable with respect to $\rho(\lambda)$,
and such that

$$\int_m^M |f(\lambda)|^2\, d\rho(\lambda) < +\infty. \tag{287}$$

The space $L_\rho^2$ is a concrete form of space $H$. In this space, the
operator of multiplication by the independent variable:

$$A_1[f(\lambda)] = \lambda f(\lambda) \tag{288}$$

is obviously bounded and self-conjugate, since

$$\|\lambda f(\lambda)\| \le n\, \|f(\lambda)\|,$$

where $n$ is the greater of the numbers $|m|$ and $|M|$, and, since
$\lambda$ is real,

$$(\lambda f(\lambda), g(\lambda)) = (f(\lambda), \lambda g(\lambda)) = \int_m^M \lambda f(\lambda)\, \overline{g(\lambda)}\, d\rho(\lambda).$$
We shall now establish the connection between the space $C_x$ and the
function space $L_\rho^2$. In view of the existence of the Hellinger integral
(249), any element $y$ of $C_x$ corresponds to a function $y(\lambda)$ of $L_\rho^2$ such
that [82]:

$$\varphi(\lambda) = (y, \mathcal{E}_\lambda x) = \int_m^\lambda y(\mu)\, d\rho(\mu). \tag{289}$$

Distinct elements $y$ and $z$ of $C_x$ correspond to distinct elements
$y(\lambda)$ and $z(\lambda)$ of $L_\rho^2$. For, if $y$ and $z$ were to correspond to
equivalent functions $y(\lambda)$ and $z(\lambda)$, we should have, by (289):
$(y - z, \mathcal{E}_\lambda x) = 0$ for any $\lambda$. The difference $y - z$ would thus be orthogonal
to all the linear combinations of the $\mathcal{E}_\lambda x$, and on passing to the limit,
$y - z$ would have to be orthogonal to the whole of the subspace $C_x$.
But $y - z \in C_x$, and we should get $(y - z, y - z) = \|y - z\|^2 = 0$,
i.e. $y = z$. Conversely, in the case of two non-equivalent functions of
$L_\rho^2$, the integral in (289) cannot have the same value for all $\lambda$ [52].
Thus (289) establishes a one-to-one correspondence between elements
$y$ of $C_x$ and elements of some lineal $M$ of $L_\rho^2$. We shall show first
that $M$ is a closed lineal. By using Lebesgue-Stieltjes integrals, we
can rewrite (256) as [82]

$$(y, z) = \int_m^M y(\lambda)\, \overline{z(\lambda)}\, d\rho(\lambda), \tag{290}$$

where $z(\lambda)$ is the element of $L_\rho^2$ corresponding to the element $z$ of
$C_x$, i.e.

$$(z, \mathcal{E}_\lambda x) = \int_m^\lambda z(\mu)\, d\rho(\mu). \tag{291}$$


Let $y^{(n)}(\lambda)$ be a sequence of elements of $M$ and $y^{(n)}$ the
corresponding elements of $C_x$. On putting $y = z = y^{(n)} - y^{(m)}$ in (290), we get

$$\|y^{(n)} - y^{(m)}\|^2 = \int_m^M |y^{(n)}(\lambda) - y^{(m)}(\lambda)|^2\, d\rho(\lambda). \tag{292}$$

If the $y^{(n)}(\lambda)$ tend in the mean to some element $y_0(\lambda)$ of $L_\rho^2$, the
right-hand side of (292) tends to zero as $n$ and $m \to \infty$, so that
the sequence of elements $y^{(n)}$ is mutually convergent, and there
exists an element $u$ such that $y^{(n)} \Rightarrow u$, where $u \in C_x$, since $C_x$
is a subspace. Let $u(\lambda)$ be the element of $M$ corresponding to the
element $u$ in accordance with (289). We shall show that the element
$u(\lambda)$ is equivalent to $y_0(\lambda)$. It will follow from this that $y_0(\lambda) \in M$,
i.e. that $M$ is a closed lineal. Putting $y = z = u - y^{(n)}$ in (290),
we get

$$\|u - y^{(n)}\|^2 = \int_m^M |u(\lambda) - y^{(n)}(\lambda)|^2\, d\rho(\lambda),$$

from which it is clear that the $y^{(n)}(\lambda)$ tend in the mean to $u(\lambda)$, so
that the function $u(\lambda)$ is equivalent to $y_0(\lambda)$, since the limit in the
mean is unique. We now show that the closed lineal $M$ coincides with
$L_\rho^2$. If this were not the case, an element $f_0(\lambda)$ of $L_\rho^2$ would exist,
not equivalent to the zero element, and orthogonal to all the elements
of $M$. Since $\mathcal{E}_\lambda \mathcal{E}_\nu = \mathcal{E}_\nu$ for $\nu < \lambda$ and $\mathcal{E}_\lambda \mathcal{E}_\nu = \mathcal{E}_\lambda$ for $\nu > \lambda$, (289)
takes the following form for the element $y = \mathcal{E}_\nu x$:

$$(\mathcal{E}_\nu x, \mathcal{E}_\lambda x) = (x, \mathcal{E}_\nu \mathcal{E}_\lambda x) = \|\mathcal{E}_\nu \mathcal{E}_\lambda x\|^2 = \int_m^\nu d\rho(\mu) \quad \text{for } \lambda > \nu,$$

$$(\mathcal{E}_\nu x, \mathcal{E}_\lambda x) = \int_m^\lambda d\rho(\mu) \quad \text{for } \lambda < \nu,$$

i.e. the function $f(\lambda)$ of $M$ that corresponds to $\mathcal{E}_\nu x$ is equivalent to
the function given by

$$f(\lambda) = 1 \text{ for } \lambda < \nu \quad \text{and} \quad f(\lambda) = 0 \text{ for } \lambda > \nu.$$

The value of $f(\lambda)$ for $\lambda = \nu$ is not important, in view of the
continuity of $\rho(\lambda)$. The orthogonality of $f_0(\lambda)$ to the function just defined
gives us, for any $\nu$:

$$\int_m^\nu f_0(\lambda)\, d\rho(\lambda) = 0,$$

whence it follows, as we know from [52], that $f_0(\lambda)$ is equivalent
to zero with respect to $\rho(\lambda)$, and the subspace $M$ must therefore
coincide with $L_\rho^2$. The above discussion gives the following theorem:

THEOREM 1. Formula (289) establishes a one-to-one correspondence
between elements $y$ of $C_x$ and elements $y(\lambda)$ of $L_\rho^2$.

In view of (290), the value of the scalar product is preserved in
this correspondence, and hence the norms of corresponding elements
are the same. Moreover, the correspondence is obviously distributive,
since the scalar product $(y, \mathcal{E}_\lambda x)$ is distributive with respect to $y$,
and the integral in (289) is distributive. Thus, with this correspondence,
the function space $L_\rho^2$ is a concrete form of the Hilbert space $C_x$. An operator
$A$ whose spectral function is $\mathcal{E}_\lambda$ is defined in this space. Let us prove
the following theorem:

THEOREM 2. The replacement of $y$ by $Ay$ in $C_x$ corresponds to
multiplication by $\mu$ in $L_\rho^2$, i.e. the operator $A$ in $C_x$ corresponds to the operator
(288) of multiplication by the independent variable in $L_\rho^2$.

By using (206) and (289), and a property [75] of the Lebesgue-Stieltjes
integral, we can write, on the assumption that $y \in C_x$:

$$(Ay, \mathcal{E}_\lambda x) = \int_m^M \mu\, d_\mu(\mathcal{E}_\mu y, \mathcal{E}_\lambda x) = \int_m^\lambda \mu\, d(y, \mathcal{E}_\mu x) = \int_m^\lambda \mu\, d_\mu\left[\int_m^\mu y(\nu)\, d\rho(\nu)\right] = \int_m^\lambda \mu\, y(\mu)\, d\rho(\mu),$$

i.e.

$$(Ay, \mathcal{E}_\lambda x) = \int_m^\lambda \mu\, y(\mu)\, d\rho(\mu),$$

whence it follows, on comparing with (289), that replacement of $y$
by $Ay$ corresponds to multiplication of $y(\mu)$ by $\mu$. Notice also that
the general formula (259) for a bilinear functional $(Ay, z)$, where
$y$ and $z \in C_x$, can be written, using (290) and the theorem proved,
with the aid of the Lebesgue-Stieltjes integral as

$$(Ay, z) = \int_m^M \lambda\, y(\lambda)\, \overline{z(\lambda)}\, d\rho(\lambda). \tag{293}$$
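
The correspondence of Theorems 1 and 2 can be followed on a discrete model, in which $\rho(\lambda)$ is a step function instead of a continuous one. The sketch below is only an analogue under stated assumptions and does not reproduce the continuous case treated in the text: $A$ is a real diagonal matrix with distinct eigenvalues, the generating element `X` has non-zero coordinates, the jumps of $\rho$ are the squares of these coordinates, and all the names used are hypothetical.

```python
from fractions import Fraction

EIGENVALUES = [Fraction(1), Fraction(2), Fraction(4)]   # A = diag(1, 2, 4)
X = [Fraction(1), Fraction(2), Fraction(3)]             # generating element x
WEIGHTS = [c * c for c in X]                            # jumps of rho at the eigenvalues

def to_function(y):
    """Values y(lambda_i) of the function that corresponds to the element y."""
    return [yi / xi for yi, xi in zip(y, X)]

def apply_A(y):
    """The operator A applied to the element y."""
    return [lam * yi for lam, yi in zip(EIGENVALUES, y)]

def scalar_product(y, z):
    """Scalar product of two real elements."""
    return sum(a * b for a, b in zip(y, z))

def weighted_product(f, g):
    """Discrete counterpart of the integral in (290), for real functions."""
    return sum(a * b * w for a, b, w in zip(f, g, WEIGHTS))

y = [Fraction(1), Fraction(0), Fraction(5)]
z = [Fraction(2), Fraction(-1), Fraction(1)]

# The scalar product is preserved by the correspondence, as in (290).
preserved = scalar_product(y, z) == weighted_product(to_function(y), to_function(z))

# Replacing y by Ay multiplies the corresponding function by the variable.
multiplied = to_function(apply_A(y)) == [
    lam * value for lam, value in zip(EIGENVALUES, to_function(y))
]
```

Both `preserved` and `multiplied` come out true; the first mirrors formula (290) and the second mirrors Theorem 2.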


153. The unitary equivalence of self-conjugate operators. Let $\mathcal{E}_\lambda$
be a resolution of the identity for the self-conjugate operator $A$, $U$
be a unitary operator and $B = UAU^{-1}$. As may easily be seen, the
operator $\mathcal{E}'_\lambda = U\mathcal{E}_\lambda U^{-1}$ is also a resolution of the identity. Let $B'$ be
the corresponding self-conjugate operator, so that $B'x$ is defined as
the limit of a sum:

$$\sum_{k=1}^n \lambda_k (U \Delta_k\mathcal{E}_\lambda U^{-1})\, x = U\left(\sum_{k=1}^n \lambda_k \Delta_k\mathcal{E}_\lambda\right) U^{-1} x,$$

whence it is clear that $B'$ coincides with $B$, i.e. $\mathcal{E}'_\lambda = U\mathcal{E}_\lambda U^{-1}$ is
the spectral function of the operator $B$. It follows from this that
self-conjugate operators that are unitary equivalents must have the same
spectrum.
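
A small numerical check may make the last statement concrete. The sketch below is an illustration only, with hypothetical matrices: $U$ is a permutation matrix (a real unitary matrix whose inverse is its transpose), and the equality of the spectra of $A$ and $B = UAU^{-1}$ is verified through the trace and the determinant, which fix the characteristic polynomial of a $2 \times 2$ matrix.

```python
def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(X):
    """Transposed matrix; for a real unitary matrix this is the inverse."""
    return [list(row) for row in zip(*X)]

def trace(X):
    return X[0][0] + X[1][1]

def det(X):
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

A = [[2, 1], [1, 3]]          # a symmetric operator (hypothetical example)
U = [[0, 1], [1, 0]]          # permutation matrix, U^{-1} = U^T

B = matmul(matmul(U, A), transpose(U))

# Equal trace and determinant give equal characteristic polynomials,
# hence the same eigenvalues for A and B.
same_spectrum = trace(A) == trace(B) and det(A) == det(B)
```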

In the case of a purely point spectrum, coincidence of the
eigenvalues and their ranks is sufficient as well as necessary for unitary
equivalence. The unitary operator $U$ is easily formed as the operator
transforming the subspace of eigenelements of $B$ into the subspace
of eigenelements of $A$ corresponding to the same eigenvalue. The
question of the conditions for unitary equivalence becomes much more
complicated in the case of a continuous spectrum. We shall quote
without proof the fundamental result relating to this case (the space
is assumed separable).

The necessary and sufficient conditions for unitary equivalence of
two self-conjugate operators $A^{(1)}$ and $A^{(2)}$ are as follows: (1) the spectra
of the operators are of the same type (purely point, purely continuous,
or mixed); (2) in the case of a point spectrum, it consists, for both
operators, of the same eigenvalues with the same rank; (3) in the
case of a continuous spectrum, the number of invariant subspaces
in the normal subdivision of the continuous part of the spectral
function is the same for the operators, and if the functions

$$\|\mathcal{E}_\lambda^{(1)} y_k^{(1)}\|^2 = \rho_k^{(1)}(\lambda) \quad \text{and} \quad \|\mathcal{E}_\lambda^{(2)} y_k^{(2)}\|^2 = \rho_k^{(2)}(\lambda),$$

which we formed in [149], are brought in for the normal forms of the
continuous spectra of $A^{(1)}$ and $A^{(2)}$, every set of measure zero with
respect to $\rho_k^{(1)}(\lambda)$ must be a set of measure zero with respect to $\rho_k^{(2)}(\lambda)$,
and vice versa, i.e. for any $k$,

$$\rho_k^{(1)}(\lambda) = \int_m^\lambda \varphi_k^{(1)}(\mu)\, d\rho_k^{(2)}(\mu) \quad \text{and} \quad \rho_k^{(2)}(\lambda) = \int_m^\lambda \varphi_k^{(2)}(\mu)\, d\rho_k^{(1)}(\mu),$$

where $\varphi_k^{(1)}(\lambda)$ is measurable with respect to $\rho_k^{(2)}(\lambda)$, non-negative and
summable, and similarly for $\varphi_k^{(2)}(\lambda)$.


154. The spectral resolution of unitary operators. Let $A$ be a
self-conjugate operator and $\mathcal{E}_\lambda$ its spectral function, where

$$\mathcal{E}_\lambda = 0 \text{ for } \lambda = 0 \quad \text{and} \quad \mathcal{E}_\lambda = E \text{ for } \lambda = 1. \tag{294}$$

We form the operator $U$ in accordance with the formula

$$U = \int_0^1 e^{2\pi i\lambda}\, d\mathcal{E}_\lambda = e^{2\pi iA}. \tag{295}$$

The conjugate operator can be formed simply by replacing the
function $e^{2\pi i\lambda}$ by its conjugate $e^{-2\pi i\lambda}$ [143], i.e.

$$U^* = \int_0^1 e^{-2\pi i\lambda}\, d\mathcal{E}_\lambda,$$

and, in view of (214), we arrive at the equations $UU^* = U^*U = E$,
i.e. the $U$ given by (295), with condition (294), is a unitary
operator. The converse will be stated without proof.

THEOREM. If we take all the possible resolutions of the identity
satisfying conditions (294), formula (295) represents the general form of a
unitary operator, where distinct unitary operators $U$ correspond to
distinct resolutions of the identity $\mathcal{E}_\lambda$.

In [143] we defined the function $f(A)$ of a self-conjugate operator
$A$, corresponding to a continuous function $f(t)$. We shall generalize
this definition in the next section to a wider class of functions $f(t)$.
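
Formula (295) can be traced on a finite-dimensional model, where the integral becomes a finite sum over the eigenvalues. The sketch below is illustrative only: the eigenvalues 0.25 and 0.5 (chosen in the interval $(0, 1]$, in keeping with (294)), their projectors, and the function names are assumptions of the example. The identity $UU^* = E$ is checked to within rounding error.

```python
import cmath

# Hypothetical spectral data: eigenvalue -> projector onto its eigen-subspace.
SPECTRUM = {
    0.25: [[1.0, 0.0], [0.0, 0.0]],
    0.5: [[0.0, 0.0], [0.0, 1.0]],
}

def unitary_from_spectrum():
    """Finite analogue of (295): sum of exp(2*pi*i*lambda_k) times the jump at lambda_k."""
    U = [[0j, 0j], [0j, 0j]]
    for lam, P in SPECTRUM.items():
        phase = cmath.exp(2j * cmath.pi * lam)
        U = [[U[i][j] + phase * P[i][j] for j in range(2)] for i in range(2)]
    return U

def adjoint(X):
    """Conjugate transpose of a 2x2 complex matrix."""
    return [[X[j][i].conjugate() for j in range(2)] for i in range(2)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def is_identity(X, tol=1e-12):
    """True if X equals the unit matrix to within tol."""
    return all(abs(X[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(2) for j in range(2))

U = unitary_from_spectrum()
```

In this example `U` is, up to rounding, the diagonal matrix with entries $i$ and $-1$, whose eigenvalues lie on the unit circle.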


155. Functions of a self-conjugate operator. Let $A$ be a
self-conjugate operator and $\mathcal{E}_\lambda$ its spectral function. If $f(\lambda)$ is continuous
in the interval $[m, M]$, the operator $f(A)$ has been defined by

$$f(A) = \int_m^M f(\lambda)\, d\mathcal{E}_\lambda,$$

or by the equivalent formula for a bilinear functional:

$$(f(A)x, y) = \int_m^M f(\lambda)\, d(\mathcal{E}_\lambda x, y), \tag{296}$$

where $(\mathcal{E}_\lambda x, y)$ is a complex function of bounded variation of $\lambda$.
As we know [125], it is a linear combination of four non-decreasing
functions of the form $\|\mathcal{E}_\lambda z\|^2$, where $z$ is an element of $H$. Thus,
if $f(\lambda)$ is any bounded function, measurable with respect to the
non-decreasing functions

$$\|\mathcal{E}_\lambda z\|^2 \tag{297}$$

for any choice of $z$, integral (296) exists for any $x$ and $y$, and a bilinear
functional $(f(A)x, y)$ is thereby defined. This functional is clearly
distributive, since $(\mathcal{E}_\lambda x, y)$ is distributive. Let us show that the
functional is bounded; we shall assume that $f(\lambda)$ is real, though this
is not actually essential. Since $f(\lambda)$ is bounded, $|f(\lambda)| \le C$, where
$C$ is a positive number. On putting $y = x$, we arrive at the expression
for a quadratic functional:

$$(f(A)x, x) = \int_m^M f(\lambda)\, d\|\mathcal{E}_\lambda x\|^2. \tag{298}$$


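Remember that, if $\mathcal{E}_m$ is non-zero, the last integral is equivalent
to the sum

$$f(m)\, \|\mathcal{E}_m x\|^2 + \int_m^M f(\lambda)\, d\|\mathcal{E}_\lambda x\|^2,$$

where the integral is understood as the ordinary limit of a sum.
Since $|f(\lambda)| \le C$ and $\|\mathcal{E}_M x\| = \|x\|$, we have for integral (298):

$$|(f(A)x, x)| \le C\, \|x\|^2. \tag{298$_1$}$$

We have further, on using $\Re$ to denote the real part:

$$4\,\Re \int_m^M f(\lambda)\, d(\mathcal{E}_\lambda x, y) = \int_m^M f(\lambda)\, d(\mathcal{E}_\lambda(x + y), x + y) - \int_m^M f(\lambda)\, d(\mathcal{E}_\lambda(x - y), x - y),$$

or

$$4\,\Re \int_m^M f(\lambda)\, d(\mathcal{E}_\lambda x, y) = \int_m^M f(\lambda)\, d\|\mathcal{E}_\lambda(x + y)\|^2 - \int_m^M f(\lambda)\, d\|\mathcal{E}_\lambda(x - y)\|^2,$$

and, by (298$_1$),

$$\left|4\,\Re \int_m^M f(\lambda)\, d(\mathcal{E}_\lambda x, y)\right| \le C\left[\|x + y\|^2 + \|x - y\|^2\right] = 2C\left[\|x\|^2 + \|y\|^2\right].$$

On arguing as in [122], the same expression as above can be obtained
at once, but without the sign of the real part:

$$4\left|\int_m^M f(\lambda)\, d(\mathcal{E}_\lambda x, y)\right| \le 2C\left[\|x\|^2 + \|y\|^2\right].$$

When $\|x\| = \|y\| = 1$, we obtain

$$|(f(A)x, y)| = \left|\int_m^M f(\lambda)\, d(\mathcal{E}_\lambda x, y)\right| \le C. \tag{299}$$

If $x$ and $y$ have an arbitrary norm, we can write

$$(f(A)x, y) = \|x\|\, \|y\| \left(f(A)\frac{x}{\|x\|}, \frac{y}{\|y\|}\right),$$

where the norm of the elements $x/\|x\|$ and $y/\|y\|$ is unity, and,
in view of (299),

$$|(f(A)x, y)| \le C\, \|x\|\, \|y\|,$$

whence it follows that the bilinear functional $(f(A)x, y)$ is bounded.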


We thus arrive at the following fundamental result: if $f(\lambda)$ is a
bounded function, measurable with respect to the non-decreasing functions
(297) with any choice of $z$, then (296) defines a linear operator $f(A)$.
Some properties of the operators $f(A)$ must be mentioned. If $f(\lambda)$ is
real, it follows at once from (296) with $y = x$ that $f(A)$ is a self-conjugate
operator. If $f(\lambda)$ is complex, the conjugate operator $f(A)^*$ is obtained
from (296) with $f(\lambda)$ replaced by the conjugate function. If $f(\lambda) \ge 0$,
it follows from (298) that $f(A)$ is a positive operator. Let us take the
function $f_\mu(\lambda)$, defined as follows:

$$f_\mu(\lambda) = 1 \text{ for } \lambda \le \mu \quad \text{and} \quad f_\mu(\lambda) = 0 \text{ for } \lambda > \mu. \tag{300}$$

This is obviously a B function, and we can form integral (296).
On subdividing the domain of integration into $[m - \varepsilon, \mu]$ and
$[\mu, M]$, we get

$$(f_\mu(A)x, y) = \int_{m-\varepsilon}^\mu d(\mathcal{E}_\lambda x, y) = (\mathcal{E}_\mu x, y),$$

whence it follows that

$$\mathcal{E}_\mu = f_\mu(A). \tag{301}$$

This formula is obtained on the assumption that every self-conjugate
operator $A$ has a spectral function $\mathcal{E}_\lambda$, in terms of which it can be
expressed with the aid of (204), i.e. when deducing (301) we have
taken as our basis the theorem of [142], which is not yet proved.
The proof will amount in essence to defining for any self-conjugate
operator $A$ the function $f_\mu(A)$, without making use of the spectral
function $\mathcal{E}_\lambda$, after which we prove the fundamental formula (204)
by putting $\mathcal{E}_\mu = f_\mu(A)$.

The familiar properties of the Lebesgue-Stieltjes integral can be 
used to obtain quite easily the properties of a function of a self- 
conjugate operator A. It will naturally be assumed here that all the 
functions f(A) discussed belong to the class defined above, i.e. are 
bounded and measurable with respect to functions (210) with any 
choice of z. 

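THEOREM. 1. The operator corresponding to a linear combination of
functions $a_1 f_1(\lambda) + a_2 f_2(\lambda) + \ldots + a_p f_p(\lambda)$ is
$a_1 f_1(A) + a_2 f_2(A) + \ldots + a_p f_p(A)$.

2. $f(A)$ commutes with $\mathcal{E}_\mu$ and $A$.

3. We have

$$(f_1(A)x, f_2(A)y) = \int_m^M f_1(\lambda)\, \overline{f_2(\lambda)}\, d(\mathcal{E}_\lambda x, y). \tag{302}$$

4. The operator corresponding to the function $f_1(\lambda) f_2(\lambda)$ is

$$f_1(A) f_2(A) = f_2(A) f_1(A). \tag{303}$$

Let us show e.g. that $f(A)$ commutes with $\mathcal{E}_\mu$:

$$(f(A)\mathcal{E}_\mu x, y) = \int_m^M f(\lambda)\, d(\mathcal{E}_\lambda \mathcal{E}_\mu x, y) = \int_m^M f(\lambda)\, d(\mathcal{E}_\lambda x, \mathcal{E}_\mu y) = (f(A)x, \mathcal{E}_\mu y) = (\mathcal{E}_\mu f(A)x, y),$$

whence it follows that $f(A)\mathcal{E}_\mu = \mathcal{E}_\mu f(A)$. Notice also that, by (180),
it follows from the above formulae that

$$(f(A)\mathcal{E}_\mu x, y) = \int_m^\mu f(\lambda)\, d(\mathcal{E}_\lambda x, y). \tag{304}$$

Formula (302) is obtained from the following chain of equations:

$$(f_1(A)x, f_2(A)y) = \int_m^M f_1(\lambda)\, d(\mathcal{E}_\lambda x, f_2(A)y) = \int_m^M f_1(\lambda)\, d\,\overline{(f_2(A)\mathcal{E}_\lambda y, x)} =$$
$$= \int_m^M f_1(\lambda)\, d\left[\overline{\int_m^\lambda f_2(\mu)\, d(\mathcal{E}_\mu y, x)}\right] = \int_m^M f_1(\lambda)\, \overline{f_2(\lambda)}\, d(\mathcal{E}_\lambda x, y). \tag{305}$$

Finally,

$$(f_1(A) f_2(A)x, y) = \int_m^M f_1(\lambda)\, d(\mathcal{E}_\lambda f_2(A)x, y) = \int_m^M f_1(\lambda)\, d\int_m^\lambda f_2(\mu)\, d(\mathcal{E}_\mu x, y) = \int_m^M f_1(\lambda) f_2(\lambda)\, d(\mathcal{E}_\lambda x, y),$$

and similarly for the product $f_2(A) f_1(A)$. It can be shown, precisely
as in [143], that $f(A)$ commutes with any operator $B$ that commutes
with $A$. The converse is also true, i.e. if a bounded linear operator
$C$ commutes with any operator $B$ that commutes with $A$, there
exists a function $f(\lambda)$ such that $C = f(A)$. The proof of this important
proposition can be found in F. Riesz's article Functions of Hermitian
operators in Hilbert space (O funktsiyakh ermitovykh operatorov
v gil'bertovom prostranstve) (Uspekhi matematicheskikh nauk, IX).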


If we put $f_1(\lambda) = f_2(\lambda) = f(\lambda)$ and $y = x$ in (305), we obtain

$$\|f(A)x\|^2 = \int_m^M |f(\lambda)|^2\, d\|\mathcal{E}_\lambda x\|^2. \tag{306}$$

Some further simple facts may be mentioned, in regard to functions
of a self-conjugate operator. If $|f(\lambda)| = 1$, then $f(A)$ is a unitary
operator. If $f(\lambda)$ takes only the values 0 and 1, $f(A)$ is a projector.
Let us show that, if $\lambda = \lambda_0$ is an eigenvalue of $A$ and $x_0$ a
corresponding eigenelement, $f(\lambda_0)$ is an eigenvalue of $f(A)$ with the same
eigenelement $x_0$. We know that $\mathcal{E}_\lambda x_0 = 0$ for $\lambda < \lambda_0$, and $\mathcal{E}_\lambda x_0 = x_0$ for
$\lambda \ge \lambda_0$, and (296) gives us for any $y$: $(f(A)x_0, y) = f(\lambda_0)(x_0, y) = (f(\lambda_0)x_0, y)$,
whence, since $y$ is arbitrary, we have $f(A)x_0 = f(\lambda_0)x_0$.
Notice that, if $f(\lambda)$ has a finite number of discontinuities, it is
measurable with respect to all the functions (297), so that $f(A)$ has
a definite meaning. The same will be true when $f(\lambda)$ is a B function
[47]; this fact has already been used above.
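
In the finite-dimensional case integral (296) reduces to a finite sum, $f(A) = \sum_k f(\lambda_k) P_k$, where the $P_k$ are the projectors onto the subspaces of eigenelements. The following sketch is illustrative only, with a hypothetical $2 \times 2$ matrix and hypothetical names; it checks the product rule (303) for $f_1(\lambda) = f_2(\lambda) = \lambda$ and relation (301) for a step function.

```python
# Hypothetical example: A = [[2, 1], [1, 2]] with eigenvalues 1 and 3.
PROJECTORS = {
    1.0: [[0.5, -0.5], [-0.5, 0.5]],   # projector for the eigenvalue 1
    3.0: [[0.5, 0.5], [0.5, 0.5]],     # projector for the eigenvalue 3
}

def function_of_operator(f):
    """Finite analogue of (296): sum of f(eigenvalue) times its projector."""
    return [[sum(f(ev) * P[i][j] for ev, P in PROJECTORS.items())
             for j in range(2)]
            for i in range(2)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = function_of_operator(lambda t: t)              # recovers the operator itself
square = function_of_operator(lambda t: t * t)     # operator for f(t) = t^2

# Step function (300) with mu = 2: equal to 1 for t <= 2 and to 0 for t > 2.
step = function_of_operator(lambda t: 1.0 if t <= 2.0 else 0.0)
```

Here `square` coincides with the matrix product of `A` with itself, in agreement with (303), and `step` coincides with the projector for the eigenvalue 1, i.e. with the value of the spectral function at $\mu = 2$, in agreement with (301).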


156. Commuting operators. We must now consider the problem of commuting
self-conjugate operators.

THEOREM 1. The necessary and sufficient condition for two self-conjugate
operators $A$ and $B$ to commute is that their spectral functions $\mathcal{E}_\lambda$ and $\mathcal{F}_\mu$ commute
for any $\lambda$ and $\mu$.

We know that the spectral function of any self-conjugate
operator $C$ commutes with $C$, and with any operator that commutes with
$C$ [143]. It follows from this that, if $AB = BA$, then $\mathcal{F}_\mu$ commutes with $A$, so that
$\mathcal{E}_\lambda$ commutes with $\mathcal{F}_\mu$. Conversely, if $\mathcal{E}_\lambda$ commutes with $\mathcal{F}_\mu$, the
Riemann-Stieltjes sums in the integral forms of the operators $A$ and $B$ commute, so that
the operators themselves commute.

THEOREM 2. If the self-conjugate operators $A$, $B$ and $C$, having purely point spectra, commute in pairs, there exists a closed orthonormal system of elements which are eigenelements of these operators.

Let $\mathcal{E}_\lambda$, $F_\mu$ and $G_\nu$ be the spectral functions of the operators. By Theorem 1, they commute in pairs. Let $\lambda$, $\mu$ and $\nu$ be any given eigenvalues of $A$, $B$ and $C$, and $L_\lambda$, $M_\mu$ and $N_\nu$ be the subspaces of the corresponding eigenelements, whilst $\Delta_\lambda = \mathcal{E}_\lambda - \mathcal{E}_{\lambda-0}$; $\Delta'_\mu = F_\mu - F_{\mu-0}$; $\Delta''_\nu = G_\nu - G_{\nu-0}$ are the projectors onto these subspaces. These projectors mutually commute, so that their product

$$S_{\lambda\mu\nu} = \Delta_\lambda \Delta'_\mu \Delta''_\nu$$

is the projector onto subspace $R_{\lambda\mu\nu}$, which consists of elements common to $L_\lambda$, $M_\mu$ and $N_\nu$ [140]. If we take two distinct subspaces $R_{\lambda\mu\nu}$ and $R_{\lambda'\mu'\nu'}$, at least one of the number pairs $(\lambda, \lambda')$, $(\mu, \mu')$ and $(\nu, \nu')$ consists of different numbers. Suppose say $\lambda \ne \lambda'$. Now, if $x \in R_{\lambda\mu\nu}$ and $x' \in R_{\lambda'\mu'\nu'}$, then $x$ and $x'$ are eigenelements of $A$ corresponding to different eigenvalues, i.e. they are mutually orthogonal. The subspaces $R_{\lambda\mu\nu}$ are therefore mutually orthogonal. We must show that their orthogonal sum is the whole of $H$. All we need do is show that no non-zero element exists which is orthogonal to all the subspaces $R_{\lambda\mu\nu}$, i.e. we need only show that, if an element $x_0$ is non-zero, it is not orthogonal to at least one of the $R_{\lambda\mu\nu}$. This is obvious for the subspaces $L_\lambda$, $M_\mu$ and $N_\nu$, since $A$, $B$ and $C$ have purely point spectra by hypothesis, so that the orthogonal sum, say of the $L_\lambda$, is the whole of $H$. Let us take an element $x_0 \ne 0$. By what has just been said, an eigenvalue $\lambda$ of the operator $A$ exists such that $\Delta_\lambda x_0 \ne 0$. Further, by the same arguments, an eigenvalue $\mu$ of the operator $B$ exists such that $\Delta'_\mu (\Delta_\lambda x_0) \ne 0$, and an eigenvalue $\nu$ of $C$ exists such that $\Delta''_\nu (\Delta'_\mu \Delta_\lambda x_0) \ne 0$, whence it follows that $x_0$ is not orthogonal to $R_{\lambda\mu\nu}$. Therefore, the orthogonal sum of the $R_{\lambda\mu\nu}$ is the whole of $H$. If we take a closed orthonormal system in each $R_{\lambda\mu\nu}$, we get an orthonormal system, closed in $H$, and each element of it, belonging to some $R_{\lambda\mu\nu}$, is an eigenelement of each of the operators $A$, $B$ and $C$. The proof of the theorem is exactly the same for any finite number of mutually commuting self-conjugate operators.

We saw above that different functions of the same self-conjugate operator are commuting operators [155]. Let us now prove the converse, for the case when the operators have a purely point spectrum.

THEOREM 3. If the self-conjugate operators A, B and C, having purely point 
spectra, commute in pairs, they are functions of the same self-conjugate operator D. 

Space $H$ is assumed separable. By Theorem 2, there exists a closed orthonormal system of elements

$$x_1, x_2, x_3, \ldots$$

which are eigenelements of $A$, $B$ and $C$, i.e.

$$A x_n = \lambda_n x_n; \quad B x_n = \mu_n x_n; \quad C x_n = \nu_n x_n \qquad (n = 1, 2, \ldots).$$

Let $\varrho_m = 1/m$, and let $x$ be any element. It can be expanded in elements $x_k$:

$$x = \sum_{k=1}^{\infty} a_k x_k,$$

and we define the self-conjugate operator $D$ by putting

$$Dx = \sum_{k=1}^{\infty} a_k \varrho_k x_k.$$

The series on the right-hand side is clearly convergent, since the numbers $|a_k|^2$ form a convergent series, i.e. $|a_k \varrho_k|^2$ all the more form a convergent series. It follows at once from this definition that the $x_k$ are eigenelements of $D$, corresponding to the eigenvalues $\varrho_k$, i.e. $D$ has a purely point spectrum. We can form the bounded function $f_1(\lambda)$, equal to $\lambda_k$ at the points $\lambda = \varrho_k$ and continuous everywhere, with the possible exception of the point $\lambda = 0$. Similarly, we can form $f_2(\lambda)$ with similar properties, so that $f_2(\varrho_k) = \mu_k$, and $f_3(\lambda)$ so that $f_3(\varrho_k) = \nu_k$. By the results of the previous section, corresponding operators $f_1(D)$, $f_2(D)$ and $f_3(D)$ can be formed for the functions $f_s(\lambda)$. The operator $f_1(D)$ has eigenelements $x_k$ and eigenvalues $f_1(\varrho_k) = \lambda_k$, where the $x_k$ form a closed system. The operator $A$ has the same eigenvalues and eigenelements. But if two operators with purely point spectra have the same eigenvalues and corresponding eigenelements, their integral forms in terms of the spectral function imply that they coincide, i.e. $A = f_1(D)$. Similarly, $B = f_2(D)$ and $C = f_3(D)$; the theorem is proved. Notice that the proof will be exactly the same for any finite number of self-conjugate operators. The theorem can also be proved in the case when the operators do not have purely point spectra, but we shall not dwell on this (J. Neumann, Annals of Math., t. 32, 1931).
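
The construction used in the proof can be imitated in a finite-dimensional model, in which the common eigenbasis is taken as the coordinate basis and each operator is given by its list of eigenvalues. This is only a sketch: the eigenvalue lists are hypothetical, and the piecewise-linear interpolant is merely one possible choice of the functions $f_1$, $f_2$.

```python
from fractions import Fraction

# Hypothetical eigenvalues of two commuting operators A and B in a common eigenbasis.
lam = [Fraction(3), Fraction(-1), Fraction(3), Fraction(1, 2)]
mu = [Fraction(0), Fraction(2), Fraction(5), Fraction(2)]

# Eigenvalues rho_k = 1/k of the auxiliary operator D (all distinct).
rho = [Fraction(1, k) for k in range(1, len(lam) + 1)]


def interpolant(nodes, values):
    """Piecewise-linear f with f(nodes[k]) == values[k], constant outside the nodes."""
    pts = sorted(zip(nodes, values))

    def f(t):
        if t <= pts[0][0]:
            return pts[0][1]
        if t >= pts[-1][0]:
            return pts[-1][1]
        for (t0, v0), (t1, v1) in zip(pts, pts[1:]):
            if t0 <= t <= t1:
                return v0 + (v1 - v0) * (t - t0) / (t1 - t0)

    return f


f1 = interpolant(rho, lam)
f2 = interpolant(rho, mu)

# In the eigenbasis, f(D) is diagonal with entries f(rho_k).
A_from_D = [f1(r) for r in rho]
B_from_D = [f2(r) for r in rho]
```

Because the $\varrho_k$ are distinct, a single function can assign an arbitrary eigenvalue to each eigenelement, which is the whole point of introducing $D$.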


157. Perturbations of the spectrum of a self-conjugate operator. Remember that the points of condensation of the spectrum of a self-conjugate operator are defined as the values of $\lambda$ which are either limit points of a point spectrum, or eigenvalues of infinite rank, or points of a continuous spectrum. We shall prove below the following theorem.

THEOREM 1. If a completely continuous self-conjugate operator $C$ is added to a self-conjugate operator $A$, the set of points of condensation of the spectrum remains unchanged.

Nevertheless, the addition of a completely continuous operator can substantially change the nature of the spectrum. In fact, the following theorem holds:

THEOREM 2. We can add to any given self-conjugate operator $A$ a self-conjugate completely continuous operator $C$, with absolute norm not exceeding any given positive number $\varepsilon$, such that $A + C$ has a purely point spectrum.

The following can be proved by making use of this last theorem:

THEOREM 3. If the self-conjugate operators $A_1$ and $A_2$ have the same set of points of condensation of the spectrum, there exist a unitary operator $U$ and a self-conjugate completely continuous operator $C$ such that $A_2 = U A_1 U^{-1} + C$.

We shall only prove Theorem 1. Two preliminary lemmas are required.

LEMMA 1. If $\lambda = \mu$ is a point of condensation of the spectrum of a self-conjugate operator $A$, there exists a sequence of normalized elements $x_n$, weakly convergent to zero, such that

$$\| A x_n - \mu x_n \| \to 0. \tag{307}$$


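If $\mu$ is a limit point of a point spectrum or an eigenvalue of infinite rank, there exists an infinite sequence of mutually orthogonal normalized elements $x_n$ such that the corresponding eigenvalues $\lambda_n$ tend to $\mu$. If $z$ is any element, its Fourier coefficients $c_n = (z, x_n)$ tend to zero, and consequently $x_n \xrightarrow{w} 0$, and the lemma follows for this case from the expressions

$$\| A x_n - \mu x_n \| = \| (A - \lambda_n) x_n + (\lambda_n - \mu) x_n \| = | \lambda_n - \mu | \, \| x_n \| = | \lambda_n - \mu |.$$

Now let $\mu$ be a point of a continuous spectrum, and $\mathcal{E}_\lambda$ the continuous part of the spectral function. The difference $\mathcal{E}_{\mu+\delta} - \mathcal{E}_{\mu-\delta}$, given any small positive $\delta$, is a projector into some subspace $L_\delta$. We take a sequence of positive numbers $\delta_n$ such that $\delta_n \to 0$, and a sequence of normalized elements $x_n$ of $L_{\delta_n}$. We shall show that the lemma is also true in this case. By definition of the $x_n$, we have $(\mathcal{E}_{\mu+\delta_n} - \mathcal{E}_{\mu-\delta_n}) x_n = x_n$, and for any element $z$:

$$(z, x_n) = (z, (\mathcal{E}_{\mu+\delta_n} - \mathcal{E}_{\mu-\delta_n}) x_n) = ((\mathcal{E}_{\mu+\delta_n} - \mathcal{E}_{\mu-\delta_n}) z, x_n),$$

$$| (z, x_n) | \le \| (\mathcal{E}_{\mu+\delta_n} - \mathcal{E}_{\mu-\delta_n}) z \|.$$

But we have $\mathcal{E}_{\mu+\delta_n} - \mathcal{E}_{\mu-\delta_n} \to 0$, and the weak convergence $x_n \xrightarrow{w} 0$ is proved. To prove (307), we have to use the following obvious formula:

$$\| A x_n - \mu x_n \|^2 = \int_m^M (\lambda - \mu)^2 \, d(\mathcal{E}_\lambda x_n, x_n) = \int_m^M (\lambda - \mu)^2 \, d(\mathcal{E}_\lambda (\mathcal{E}_{\mu+\delta_n} - \mathcal{E}_{\mu-\delta_n}) x_n, x_n) =$$

$$= \int_{\mu-\delta_n}^{\mu+\delta_n} (\lambda - \mu)^2 \, d \| (\mathcal{E}_\lambda - \mathcal{E}_{\mu-\delta_n}) x_n \|^2 \le \delta_n^2 \, \| (\mathcal{E}_{\mu+\delta_n} - \mathcal{E}_{\mu-\delta_n}) x_n \|^2 = \delta_n^2 \to 0.$$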


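LEMMA 2. If $\lambda = \mu$ is not a point of condensation of the spectrum, there exists for any sequence of normalized elements $x_n$, weakly convergent to zero, a positive number $a$ such that, for all sufficiently large $n$,

$$\| A x_n - \mu x_n \| \ge a > 0. \tag{308}$$

By hypothesis, there exists a positive number $d$ such that the spectral function $\mathcal{E}_\lambda$ is either constant in the interval $\mu - d \le \lambda \le \mu + d$, or its variation amounts to a jump at the point $\lambda = \mu$, the subspace $L_\mu$ corresponding to this jump being finite-dimensional. We have

$$\| A x_n - \mu x_n \|^2 = \int_{m-\varepsilon}^{M} (\lambda - \mu)^2 \, d \| \mathcal{E}_\lambda x_n \|^2 \ge \int_{m-\varepsilon}^{\mu-d} (\lambda - \mu)^2 \, d \| \mathcal{E}_\lambda x_n \|^2 + \int_{\mu+d}^{M} (\lambda - \mu)^2 \, d \| \mathcal{E}_\lambda x_n \|^2 \ge$$

$$\ge d^2 \left[ \| \mathcal{E}_{\mu-d} x_n \|^2 + \left( \| x_n \|^2 - \| \mathcal{E}_{\mu+d} x_n \|^2 \right) \right] = d^2 - d^2 \left[ \| \mathcal{E}_{\mu+d} x_n \|^2 - \| \mathcal{E}_{\mu-d} x_n \|^2 \right],$$

or

$$\| A x_n - \mu x_n \|^2 \ge d^2 - d^2 \, ((\mathcal{E}_{\mu+d} - \mathcal{E}_{\mu-d}) x_n, x_n). \tag{309}$$

If $\mathcal{E}_{\mu+d} - \mathcal{E}_{\mu-d} = 0$, we get (308) by putting $a = d$. Now let $\mathcal{E}_\lambda$ have a jump at $\lambda = \mu$, and let $z_1, z_2, \ldots, z_m$ be a complete orthonormal system in $L_\mu$. Now,

$$(\mathcal{E}_{\mu+d} - \mathcal{E}_{\mu-d}) x_n = \sum_{s=1}^{m} (x_n, z_s) z_s,$$

we have $(\mathcal{E}_{\mu+d} - \mathcal{E}_{\mu-d}) x_n \Rightarrow 0$ since $x_n \xrightarrow{w} 0$, and it follows from (309) that (308) is satisfied for sufficiently large $n$, if we put say $a = d/2$. It follows at once from these lemmas that the necessary and sufficient condition for $\lambda = \mu$ to be a point of condensation of the spectrum is that a sequence of normalized elements $x_n$ exist such that $x_n \xrightarrow{w} 0$ and (307) holds.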

The proof of Theorem 1 is now quite easy. We add a completely continuous self-conjugate operator $C$ to the operator $A$, and let $A_1 = A + C$. Now, $A = A_1 + (-C)$, where $(-C)$ is also a completely continuous self-conjugate operator. Let $\lambda = \mu$ be a point of condensation of the spectrum of $A$, and $x_n$ a sequence satisfying condition (307); we now have $C x_n \Rightarrow 0$, since $x_n \xrightarrow{w} 0$, and it follows from $\| A_1 x_n - \mu x_n \| \le \| A x_n - \mu x_n \| + \| C x_n \|$ that $\| A_1 x_n - \mu x_n \| \to 0$, i.e. $\lambda = \mu$ is a point of condensation of the spectrum of $A_1$. Similarly, it follows conversely from $A = A_1 + (-C)$ that every point of condensation of the spectrum of $A_1$ is also a point of condensation of the spectrum of $A$; the theorem is therefore proved.


158. Normal operators. Another particular type of linear operator must be mentioned. A linear operator $A$ is said to be normal if it commutes with its conjugate [cf. IV, 41], i.e.

$$AA^* = A^*A. \tag{310}$$

Self-conjugate and unitary operators represent a particular case of normal operators. If we put

$$A_1 = \frac{1}{2} (A + A^*), \qquad A_2 = \frac{1}{2i} (A - A^*), \tag{311}$$

we can express $A$ and $A^*$ in terms of the self-conjugate operators $A_1$ and $A_2$:

$$A = A_1 + iA_2; \qquad A^* = A_1 - iA_2. \tag{312}$$
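
The decomposition (311), (312) and its relation to normality can be tried out on small matrices. The sketch below is an illustration only; the two matrices are hypothetical examples, and finite matrices stand in for bounded operators.

```python
def matmul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def adjoint(X):
    """Conjugate transpose: entry (i, j) is the conjugate of X[j][i]."""
    n = len(X)
    return [[complex(X[j][i]).conjugate() for j in range(n)] for i in range(n)]


def close(X, Y, tol=1e-12):
    return all(abs(a - b) <= tol for rx, ry in zip(X, Y) for a, b in zip(rx, ry))


def hermitian_parts(A):
    """Return (A1, A2) with A = A1 + i*A2, following (311)."""
    Astar = adjoint(A)
    n = len(A)
    A1 = [[(A[i][j] + Astar[i][j]) / 2 for j in range(n)] for i in range(n)]
    A2 = [[(A[i][j] - Astar[i][j]) / 2j for j in range(n)] for i in range(n)]
    return A1, A2


def is_normal(A):
    return close(matmul(A, adjoint(A)), matmul(adjoint(A), A))


def parts_commute(A):
    A1, A2 = hermitian_parts(A)
    return close(matmul(A1, A2), matmul(A2, A1))


normal_example = [[1, 1j], [1j, 1]]       # A A* = A* A = 2E
non_normal_example = [[0, 1], [0, 0]]     # a nilpotent matrix, not normal
```

For the first matrix both `is_normal` and `parts_commute` hold; for the second both fail, in agreement with the criterion that $A$ is normal exactly when $A_1$ and $A_2$ commute.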


These formulae have the immediate consequence that the necessary and sufficient condition for an operator to be normal is that the self-conjugate operators $A_1$ and $A_2$ commute. If this is the case, the spectral functions $\mathcal{E}^{(1)}_\lambda$ and $\mathcal{E}^{(2)}_\mu$ of these operators commute for any $\lambda$ and $\mu$. We define a family of projectors $\mathcal{E}_\alpha$, depending on the complex variable $\alpha = \lambda + \mu i$, by putting

$$\mathcal{E}_\alpha = \mathcal{E}^{(1)}_\lambda \mathcal{E}^{(2)}_\mu \qquad (\alpha = \lambda + \mu i). \tag{313}$$

This projector $\mathcal{E}_\alpha$ will only be variable in some interval $\Delta_0$ of the plane of the complex variable $\alpha$, and we shall have formulae for the operator $A$, precisely analogous to the formulae for a self-conjugate operator:

$$A = \iint_{\Delta_0} \alpha \, d\mathcal{E}_\alpha; \qquad (Ax, y) = \iint_{\Delta_0} \alpha \, d(\mathcal{E}_\alpha x, y). \tag{314}$$

Let us prove say the second of these formulae. Let the interval $\Delta_0$ be defined by the inequalities $a \le \lambda \le b$; $c \le \mu \le d$. We have

$$\iint_{\Delta_0} \alpha \, d(\mathcal{E}_\alpha x, y) = \iint_{\Delta_0} \lambda \, d_\lambda d_\mu (\mathcal{E}^{(1)}_\lambda \mathcal{E}^{(2)}_\mu x, y) + i \iint_{\Delta_0} \mu \, d_\lambda d_\mu (\mathcal{E}^{(1)}_\lambda \mathcal{E}^{(2)}_\mu x, y).$$

After forming the Riemann-Stieltjes sums, we can sum over $\mu$ in the first of these integrals, since the integrand is independent of $\mu$; it must be recalled here that $\mathcal{E}^{(2)}_c = 0$ and $\mathcal{E}^{(2)}_d = E$. Similarly, we can sum over $\lambda$ in the second integral, where $\mathcal{E}^{(1)}_a = 0$ and $\mathcal{E}^{(1)}_b = E$. We therefore obtain

$$\iint_{\Delta_0} \alpha \, d(\mathcal{E}_\alpha x, y) = \int_a^b \lambda \, d(\mathcal{E}^{(1)}_\lambda x, y) + i \int_c^d \mu \, d(\mathcal{E}^{(2)}_\mu x, y) = (A_1 x, y) + i (A_2 x, y).$$

On the other hand,

$$(Ax, y) = (A_1 x, y) + i (A_2 x, y),$$

and a comparison gives us the second of equations (314). A general theory of normal operators can be further developed in analogy with the theory of self-conjugate operators.

Suppose that, given the normal operator $A$, the self-conjugate operators $A_1$ and $A_2$ have a purely point spectrum. We can take a closed orthonormal system of elements $x_k$ $(k = 1, 2, 3, \ldots)$ which are eigenelements of $A_1$ and $A_2$ [156], i.e.

$$A_1 x_k = \mu^{(1)}_k x_k; \qquad A_2 x_k = \mu^{(2)}_k x_k.$$

Now, obviously,

$$A x_k = (A_1 + iA_2) x_k = (\mu^{(1)}_k + \mu^{(2)}_k i) x_k,$$

and the $x_k$ are therefore the eigenelements of $A$ corresponding to the eigenvalues $\mu^{(1)}_k + \mu^{(2)}_k i$.

Let us also take the case when the normal operator $A$ is completely continuous. We know that the operator $A^*$ is now also completely continuous, and by (311), the operators $A_1$ and $A_2$ are completely continuous. The theorem of [136] may readily be extended to the case of normal completely continuous operators. Let $\mu_k$ be non-zero eigenvalues of $A_1$, and $x_k$ the corresponding eigenelements, i.e.

$$A_1 x_k = \mu_k x_k.$$

Recalling that $A_1$ commutes with $A_2$, we obtain on applying $A_2$ to both sides:

$$A_1 (A_2 x_k) = \mu_k A_2 x_k,$$

i.e. $A_2 x_k$ is either the zero element, or the eigenelement of $A_1$ corresponding to the same eigenvalue. Suppose that $\mu_k$ is an eigenvalue of rank $h$ and that $\mu_k = \mu_{k+1} = \ldots = \mu_{k+h-1}$. Now, in view of what has been said, we must have

$$A_2 x_j = \sum_{s=k}^{k+h-1} c_{js} x_s \qquad (j = k, k+1, \ldots, k+h-1)$$

and

$$c_{js} = (A_2 x_j, x_s) = (x_j, A_2 x_s) = \overline{c_{sj}},$$

i.e. the $c_{js}$ form a finite Hermitian matrix. We can reduce this matrix to the diagonal form by means of a unitary transformation of the $x_j$ (which makes no essential difference), and hence we can write, on retaining the previous notation for the elements:

$$A_1 x_j = \mu_j x_j; \qquad A_2 x_j = \nu_j x_j \qquad (j = k, k+1, \ldots, k+h-1),$$

where some of the numbers $\nu_j$, or even all of them, may be zero. We can carry out this operation for all the non-zero eigenvalues of $A_1$. After this, all the non-zero eigenvalues of $A_2$ may not be obtained. If we take the eigenvalues that have not been obtained and carry out an operation similar to the above, proceeding from $A_2$ and passing to $A_1$, we finally get a finite or denumerable set of elements $y_k$ $(k = 1, 2, \ldots)$, orthogonal in pairs, normalized, and such that

$$A_1 y_k = \mu^{(1)}_k y_k; \qquad A_2 y_k = \mu^{(2)}_k y_k,$$

where at least one of the two real numbers $\mu^{(1)}_k$ and $\mu^{(2)}_k$ is non-zero, whilst every eigenelement of $A_1$ that corresponds to a non-zero eigenvalue is linearly expressible in terms of a finite number of $y_k$, and similarly for $A_2$. We also obviously have

$$A y_k = (\mu^{(1)}_k + \mu^{(2)}_k i) y_k; \qquad A^* y_k = (\mu^{(1)}_k - \mu^{(2)}_k i) y_k.$$

Suppose that

$$x = Ay = A_1 y + i A_2 y.$$

We have [136]:

$$A_1 y = \sum_k a_k y_k \quad \text{and} \quad A_2 y = \sum_k b_k y_k,$$

so that any element $x$ expressible in the form $Ay$ can be expanded in elements $y_k$:

$$x = Ay = \sum_k (a_k + b_k i) y_k.$$

Notice that, if say $\mu^{(1)}_k = 0$, the term containing $y_k$ will be absent in the expansion of $A_1 y$. We saw above [155] that, if an operator $A$ is a function of a self-conjugate operator $B$, $A^*$ is a function of $B$, so that $A$ and $A^*$ commute, i.e. $A$ is a normal operator. Thus any function of a self-conjugate operator is a normal operator. The converse is also true: every normal operator is a function of a self-conjugate operator. For, let $A = A_1 + iA_2$ be a normal operator. The self-conjugate operators $A_1$ and $A_2$ commute, so that, as remarked in [156], they are functions of the same self-conjugate operator $B$: $A_1 = F_1(B)$ and $A_2 = F_2(B)$. On forming the function $F(\lambda) = F_1(\lambda) + iF_2(\lambda)$, we get $A = F(B)$, which is what we wanted to prove.


159. Auxiliary propositions. The present and subsequent sections are devoted to proving the fundamental theorem of [142] and the fact that, if an operator commutes with a self-conjugate operator $A$, it also commutes with its spectral function $\mathcal{E}_\lambda$ for any $\lambda$. We can make use of the results which were obtained prior to [142], when developing the proof. Certain supplementary lemmas first need to be proved.

LEMMA 1. If $A$ and $B$ are commuting self-conjugate operators, satisfying the relationship

$$A^2 = B^2, \tag{315}$$

and $P$ is the projection operator into subspace $L$, formed by the elements $x$ that satisfy

$$(A + B) x = 0, \quad \text{i.e.} \quad Ax = -Bx, \tag{316}$$

the following properties may be proved:
(1) if an operator $D$ commutes with $(A + B)$, it also commutes with $P$;
(2) if $Ax = 0$, then $x \in L$, i.e. $Px = x$;
(3) the operator $A$ can be expressed by the formula

$$A = (E - 2P) B. \tag{317}$$
1. We have by hypothesis:

$$D(A + B) = (A + B)D. \tag{318}$$

If $x \in L$, by (316), $D(A + B) x = 0$, so that $(A + B) Dx = 0$, i.e. $Dx \in L$ also. Let $z$ be any element of $H$; then $Pz \in L$ and, by what has just been proved, $DPz \in L$, so that we can write for any element $z$ of $H$: $PDPz = DPz$, i.e.

$$PDP = DP. \tag{319}$$

On passing to conjugate operators in (318) and recalling that $A$ and $B$ are self-conjugate, we get [124]:

$$(A + B) D^* = D^* (A + B),$$

i.e. $D^*$ also commutes with $A + B$, and we can write (319) for it, i.e.

$$PD^*P = D^*P.$$

On passing to conjugate operators in this equation, and noting that $P$ is self-conjugate, we get $PDP = PD$. Comparison of this equation with (319) gives $DP = PD$, i.e. $D$ in fact commutes with $P$, as we wished to prove. In particular, $A$ and $B$ commute with $(A + B)$ by hypothesis, so that $A$ and $B$ commute with $P$.

2. It follows from the equations

$$\| Az \|^2 = (Az, Az) = (A^2 z, z); \qquad \| Bz \|^2 = (Bz, Bz) = (B^2 z, z)$$

and condition (315) that $\| Az \| = \| Bz \|$ for any element $z$. If $Ax = 0$, $Bx = 0$ also, so that $x$ satisfies (316), i.e. $x \in L$ and $Px = x$, which is what we had to prove.

3. Using (315) and the fact that $A$ and $B$ commute, we have $(A + B)(A - B) = 0$, i.e. if $z$ is an element of $H$, then $(A - B)z \in L$, so that $P(A - B)z = (A - B)z$, i.e.

$$P(A - B) = A - B.$$

Given any element $z$, the element $Pz \in L$, so that $(A + B) Pz = 0$, i.e.

$$(A + B) P = 0.$$

On subtracting from this the previous equation, and noting that $A$ and $B$ commute with $P$, we get $2PB = -A + B$, whence (317) follows; the lemma is proved.
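
Lemma 1 is easily visualized when $A$ and $B$ are diagonal in the same basis, so that every operator reduces to a list of diagonal entries. The following sketch assumes hypothetical diagonal entries for $A$ and takes $B$ with entries $|a_i|$, so that $A^2 = B^2$ and $B \ge 0$; it is an illustration, not part of the proof.

```python
# Hypothetical diagonal of the self-conjugate operator A.
a = [2, -3, 0, 1]

# Diagonal of B, chosen so that A^2 = B^2 and B is positive.
b = [abs(v) for v in a]

# P projects onto the solutions of (A + B)x = 0, i.e. the coordinates with a_i + b_i = 0.
p = [1 if ai + bi == 0 else 0 for ai, bi in zip(a, b)]

# Diagonal of (E - 2P)B, which by (317) should reproduce A.
reconstructed = [(1 - 2 * pi) * bi for pi, bi in zip(p, b)]

# Property (2): wherever A annihilates a basis vector, P leaves it fixed.
kernel_fixed_by_p = all(pi == 1 for ai, pi in zip(a, p) if ai == 0)
```

In this model $P$ picks out exactly the coordinates on which $A$ is non-positive, and $E - 2P$ restores the sign that was lost in passing from $A$ to $B$.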

LEMMA 2. If the self-conjugate operator $C \ge 0$, and the self-conjugate operator $F$ commutes with $C$, then $F^2 C = CF^2 \ge 0$.

Using the notation $Fx = y$, we can write by hypothesis:

$$(CF^2 x, x) = (FCFx, x) = (CFx, Fx) = (Cy, y) \ge 0,$$

which proves the lemma. A particular case is worth noticing. If the projector $P$ commutes with $C$, we can say that $PC \ge 0$, since $P^2 = P$.

If $P(t) = a_0 + a_1 t + \ldots + a_n t^n$ is a polynomial and $A$ is an operator, the polynomial can be associated, as we have seen, with the operator $P(A) = a_0 E + a_1 A + \ldots + a_n A^n$. If $A$ is a self-conjugate operator and the coefficients $a_k$ are real, $P(A)$ is also self-conjugate. We shall need two further lemmas before investigating the properties of operator polynomials.




LEMMA 3. If the polynomial $P(t)$ is positive in the interval $[0, 1]$, for all sufficiently large values of $p$ it can be written in the form

$$P(t) = \sum_{s=0}^{p} c_s \, t^s (1 - t)^{p-s}, \tag{320}$$

where all the coefficients $c_s$ are positive.

This follows for first degree polynomials from

$$P(t) = c_0 (1 - t) + c_1 t, \quad \text{where } c_0 = P(0) \text{ and } c_1 = P(1).$$

Let us take a positive second degree polynomial, that does not split up into real first degree factors:

$$P(t) = \alpha + 2\beta t + \gamma t^2 \qquad (\alpha > 0, \ \gamma > 0, \ \alpha\gamma - \beta^2 > 0).$$

Using the formula

$$[(1 - t) + t]^k = \sum_{s=0}^{k} C_k^s \, t^s (1 - t)^{k-s},$$

we can write the polynomial as

$$P(t) = \alpha \sum_{s=0}^{p} C_p^s \, t^s (1 - t)^{p-s} + 2\beta t \sum_{s=1}^{p} C_{p-1}^{s-1} \, t^{s-1} (1 - t)^{p-s} + \gamma t^2 \sum_{s=2}^{p} C_{p-2}^{s-2} \, t^{s-2} (1 - t)^{p-s},$$

or, on collecting like terms,

$$P(t) = \sum_{s=0}^{p} \frac{C_p^s}{p(p-1)} \, t^s (1 - t)^{p-s} \left[ p(p-1)\alpha + 2s(p-1)\beta + s(s-1)\gamma \right]. \tag{321}$$

The expression in square brackets is positive for all real $s$ and for all sufficiently large $p$. For, the discriminant of this quadratic form in $s$:

$$p(p-1)\alpha\gamma - \frac{1}{4} (2p\beta - 2\beta - \gamma)^2 = p^2 (\alpha\gamma - \beta^2) + p (2\beta^2 + \beta\gamma - \alpha\gamma) - \frac{1}{4} (2\beta + \gamma)^2,$$

is positive for all sufficiently large $p$ in the case $\alpha\gamma - \beta^2 > 0$. Hence (321) in fact leads to (320) with positive $c_s$ for all sufficiently large $p$. Let us now take any positive polynomial in the interval $[0, 1]$. It can be written as the product of positive polynomials of the first degree and positive second degree polynomials with imaginary roots. We have an expansion (320) for each factor. We thus get a similar expansion for their product, the degree $p$ being equal to the sum of the degrees of the individual factors.
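
Formula (321) lends itself to a direct computation. The sketch below assumes the hypothetical polynomial $P(t) = 1 - 3t + 3t^2$ (so $\alpha = 1$, $\beta = -3/2$, $\gamma = 3$, $\alpha\gamma - \beta^2 = 3/4 > 0$) and uses exact rational arithmetic; it searches for the least $p$ for which all the coefficients in (320) are positive.

```python
from fractions import Fraction
from math import comb

# Hypothetical positive quadratic P(t) = alpha + 2*beta*t + gamma*t^2 = 1 - 3t + 3t^2.
alpha, beta, gamma = Fraction(1), Fraction(-3, 2), Fraction(3)


def coefficients(p):
    """Coefficients c_s of t^s (1 - t)^(p - s) as given by (321), for p >= 2."""
    return [
        Fraction(comb(p, s), p * (p - 1))
        * (p * (p - 1) * alpha + 2 * s * (p - 1) * beta + s * (s - 1) * gamma)
        for s in range(p + 1)
    ]


def evaluate(coeffs, t):
    """Evaluate the sum of c_s * t^s * (1 - t)^(p - s)."""
    p = len(coeffs) - 1
    return sum(c * t**s * (1 - t) ** (p - s) for s, c in enumerate(coeffs))


# Least p for which every coefficient is strictly positive.
smallest_p = next(p for p in range(2, 50) if all(c > 0 for c in coefficients(p)))

t = Fraction(1, 3)
direct = alpha + 2 * beta * t + gamma * t * t
```

For this polynomial the expansion reproduces $P(t)$ for every $p \ge 2$, but a negative or zero coefficient appears for $p = 2, 3, 4$; positivity of all the $c_s$ sets in only at `smallest_p`, which illustrates why the lemma asks for sufficiently large $p$.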

Note. We can use the change of variable $t_1 = (t - a)/(b - a)$ to reduce any finite interval $a \le t \le b$ to the interval $0 \le t_1 \le 1$, and the following formula is obtained instead of (320) for polynomials positive in the interval $[a, b]$:

$$P(t) = \sum_{s=0}^{p} c_s \, (t - a)^s (b - t)^{p-s}. \tag{322}$$


LEMMA 4. If $m$ and $M$ are the bounds of a self-conjugate operator $A$, i.e. of the quadratic functional $(Ax, x)$ with $\| x \| = 1$, and $P(t)$ is a polynomial, non-negative in the interval $[m, M]$, $P(A)$ must be a positive operator, i.e.

$$(P(A) x, x) \ge 0. \tag{323}$$

It is sufficient to prove the lemma in the case when $P(t) > 0$ in the interval $[m, M]$. For, suppose that the lemma is proved in this case, and that $Q(t) \ge 0$ in $[m, M]$. On putting $P(t) = Q(t) + \varepsilon$, where $\varepsilon > 0$, we have $P(t) > 0$ in $[m, M]$, so that, by the above:

$$((Q(A) + \varepsilon E) x, x) = (Q(A) x, x) + \varepsilon (x, x) \ge 0.$$

On passing to the limit as $\varepsilon \to 0$, we get inequality (323) for $Q(A)$. Let us turn to the proof for positive $P(t)$. By (322), with $a = m$ and $b = M$, it is sufficient to show that the operator

$$(A - mE)^s (ME - A)^{p-s} \tag{324}$$

is positive, the number $p$ being taken as odd (the sum of positive operators is positive). Suppose say that $s = 2j$ is even, and let (324) be written as

$$A_2^2 A_1,$$

where

$$A_1 = (ME - A); \qquad A_2 = (A - mE)^j (ME - A)^{\frac{p-2j-1}{2}}.$$

$A_1$ commutes with $A_2$, and $A_1$ is a positive operator, since $(A_1 x, x) = M - (Ax, x) \ge 0$ for $\| x \| = 1$. By lemma 2, we can say that operator (324) is positive. When $s$ is odd, we have to take $A_1 = (A - mE)$.

COROLLARY 1. If the polynomials $P_1(t)$ and $P_2(t)$ satisfy $P_2(t) \ge P_1(t)$ in the interval $[m, M]$, i.e. $P_2(t) - P_1(t) \ge 0$, then $P_2(A) \ge P_1(A)$. In particular, if $| P(t) | \le \varepsilon$, i.e. $-\varepsilon \le P(t) \le \varepsilon$, then $-\varepsilon E \le P(A) \le \varepsilon E$, i.e. $-\varepsilon \le (P(A) x, x) \le \varepsilon$ for $\| x \| = 1$, so that the norm of $P(A)$ is not greater than $\varepsilon$ [126].

COROLLARY 2. It follows from the previous corollary that, if a sequence of polynomials $P_n(t)$ tends uniformly in the interval $[m, M]$ to a polynomial $P(t)$, then $P_n(A) \to P(A)$, and the norm of the difference $P(A) - P_n(A)$ tends to zero.


160. Power series of operators. Let us recall the lemma proved in [131], the result of which amounts to the following: if the norms of a sequence of operators $A_n$ $(n = 1, 2, 3, \ldots)$ do not exceed the positive numbers $\delta_n$, which form a convergent series, the series

$$A = \sum_{n=1}^{\infty} A_n$$

is convergent, and the norm of $A$ does not exceed the sum of the numbers $\delta_n$. In particular, if we have the power series

$$\sum_{n=0}^{\infty} a_n t^n,$$

which is absolutely convergent in the interval $| t | \le k$, and the norm of the operator $A$ does not exceed $k$, then

$$\sum_{n=0}^{\infty} a_n A^n$$

is convergent.
We shall require below the following binomial formula:

$$\sqrt{1 + t} = \sum_{n=0}^{\infty} \binom{1/2}{n} t^n \qquad (| t | < 1), \tag{325}$$

where

$$\binom{1/2}{n} = \frac{\frac{1}{2} \left( \frac{1}{2} - 1 \right) \cdots \left( \frac{1}{2} - n + 1 \right)}{n!}. \tag{326}$$

Formula (325) gives the arithmetic value of the radical and remains valid with $t = \pm 1$ [I; 138]. The coefficients of expansion (326) are positive for odd and negative for even $n > 0$. Hence, on putting $t = -1$ in (325), all the terms except the first become negative, whence it follows that

$$0 = 1 - \sum_{n=1}^{\infty} \left| \binom{1/2}{n} \right| \quad \text{or} \quad \sum_{n=1}^{\infty} \left| \binom{1/2}{n} \right| = 1, \tag{327}$$

and series (325) is absolutely and uniformly convergent for $| t | \le 1$ [I; 146]. On replacing $t$ by $t^2 - 1$ in (325), we obtain on the left the absolute value of the square root of $t^2$, i.e. the absolute value $| t |$, and the following expansion is obtained for it into an absolutely convergent series in the interval $| t | \le 1$:

$$| t | = \sum_{n=0}^{\infty} \binom{1/2}{n} (t^2 - 1)^n. \tag{328}$$
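
The behaviour of the coefficients (326) and of the series (328) can be observed numerically. In the sketch below the function names and the numbers of terms are arbitrary choices made for illustration; the coefficients are generated by the recurrence $c_n = c_{n-1} (\tfrac12 - n + 1)/n$, which follows directly from (326).

```python
def half_binomials(count):
    """Return [C(1/2, 0), ..., C(1/2, count - 1)] via c_n = c_{n-1} * (1/2 - n + 1) / n."""
    coeffs = [1.0]
    for n in range(1, count):
        coeffs.append(coeffs[-1] * (0.5 - n + 1) / n)
    return coeffs


def abs_series(t, terms=400):
    """Partial sum of (328): sum over n of C(1/2, n) * (t^2 - 1)^n."""
    u = t * t - 1.0
    total, power = 0.0, 1.0
    for c in half_binomials(terms):
        total += c * power
        power *= u
    return total


coeffs = half_binomials(2001)

# Partial sum of the series in (327); it increases towards 1 but stays below it.
partial_abs_sum = sum(abs(c) for c in coeffs[1:])
```

The partial sums of (327) approach unity rather slowly, whereas (328) converges quickly for $t$ away from zero, since then $| t^2 - 1 | < 1$.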


This expansion may be applied to a self-conjugate operator. Let $A$ be self-conjugate, with norm $n_A$. We form the self-conjugate operator $C = (A^2 / n_A^2) - E$. We have

$$(Cx, x) = \frac{1}{n_A^2} (A^2 x, x) - (x, x) = \frac{1}{n_A^2} \| Ax \|^2 - \| x \|^2,$$

whence it is clear that $-1 \le (Cx, x) \le 0$ for $\| x \| = 1$, whilst the norm of $C$ does not exceed unity. The series can be formed:

$$B = n_A \sum_{n=0}^{\infty} \binom{1/2}{n} C^n = n_A \sum_{n=0}^{\infty} \binom{1/2}{n} \left( \frac{A^2}{n_A^2} - E \right)^n. \tag{329}$$

If $S_n(t)$ is a segment of series (328), $S_n^2(t) \to t^2$ uniformly in the interval $[-1, +1]$, so that $S_n^2(A/n_A) \to A^2 / n_A^2$, and in the limit the self-conjugate operator $B$, given by (329), satisfies $B^2 = A^2$.

Further, if the operator $D$ commutes with $A$, it commutes with the segment of series (329), and hence commutes with $B$ in the limit. It follows from this, in particular, that $A$ commutes with $B$, i.e. $AB = BA$.

Let us show further that $B$ is a positive operator. Recalling that the norm of $C$ is not greater than unity, we get $| (C^n x, x) | \le \| x \|^2$, and, after writing (329) in the form

$$B = n_A \left[ E + \sum_{n=1}^{\infty} \binom{1/2}{n} C^n \right],$$

we arrive at the inequality

$$(Bx, x) \ge n_A \left[ (x, x) - \sum_{n=1}^{\infty} \left| \binom{1/2}{n} \right| \, | (C^n x, x) | \right] \ge n_A \| x \|^2 \left[ 1 - \sum_{n=1}^{\infty} \left| \binom{1/2}{n} \right| \right],$$

whence it follows, by (327), that $(Bx, x) \ge 0$.

Thus the following properties are finally obtained: $B$ is a self-conjugate positive operator, commuting with $A$ and satisfying the equation $B^2 = A^2$; every operator that commutes with $A$ also commutes with $B$. We shall use the operator $B$ and lemma 1 in the next section to form the spectral function $\mathcal{E}_\lambda$ of the operator $A$ and thus prove the fundamental formula (204) of [142].
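
The series (329) can be summed numerically for a small matrix. The following sketch assumes a hypothetical $2 \times 2$ symmetric matrix with eigenvalues $3$ and $-1$, whose norm $n_A = 3$ is taken as known; the number of terms is an arbitrary choice.

```python
def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


A = [[1.0, 2.0], [2.0, 1.0]]   # hypothetical self-conjugate operator, eigenvalues 3 and -1
norm_A = 3.0                   # assumed known: the largest absolute eigenvalue
identity = [[1.0, 0.0], [0.0, 1.0]]

# C = A^2 / n_A^2 - E, whose norm does not exceed unity.
A2 = matmul(A, A)
C = [[A2[i][j] / norm_A**2 - identity[i][j] for j in range(2)] for i in range(2)]

# Partial sum of (329): B = n_A * sum_n C(1/2, n) * C^n.
B = [[0.0, 0.0], [0.0, 0.0]]
coeff, power = 1.0, identity
for n in range(400):
    B = [[B[i][j] + norm_A * coeff * power[i][j] for j in range(2)] for i in range(2)]
    power = matmul(power, C)
    coeff *= (0.5 - n) / (n + 1)   # passes from C(1/2, n) to C(1/2, n + 1)

B2 = matmul(B, B)
```

The computed `B` is close to the matrix with rows $(2, 1)$ and $(1, 2)$, which has eigenvalues $3$ and $1$: it is positive, commutes with $A$, and its square reproduces $A^2$.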








161. The spectral function. THEOREM. Given any self-conjugate operator $A$, there is a corresponding projector $\mathcal{E}_0$ with the following properties: (1) if an operator $D$ commutes with $A$, it also commutes with $\mathcal{E}_0$; (2) if $Az = 0$, then $\mathcal{E}_0 z = z$; (3) the self-conjugate operators $A\mathcal{E}_0$ and $A(E - \mathcal{E}_0)$ satisfy the conditions

$$A\mathcal{E}_0 \le 0; \qquad A(E - \mathcal{E}_0) \ge 0. \tag{330}$$

We take as $\mathcal{E}_0$ the projector $P$ of lemma 1. If $D$ commutes with $A$, it also commutes with $B$, and hence with $(A + B)$, and the first two statements of the theorem follow from Lemma 1. Further, it follows from $A = (E - 2\mathcal{E}_0) B$ and the fact that $\mathcal{E}_0$ commutes with $A$ and $B$, and $\mathcal{E}_0^2 = \mathcal{E}_0$ that

$$A\mathcal{E}_0 = -B\mathcal{E}_0; \qquad A(E - \mathcal{E}_0) = B(E - \mathcal{E}_0).$$

But the products of the positive operator $B$ with the projectors $\mathcal{E}_0$ and $(E - \mathcal{E}_0)$, with which it commutes, are positive operators, and inequalities (330) follow at once from the last formulae, and the theorem is proved.

Let $\lambda$ be any real number. We can form for the self-conjugate operator $(A - \lambda E)$ the projector mentioned in the last theorem. Call it $\mathcal{E}_\lambda$. It has the following properties: (1) if an operator $D$ commutes with $(A - \lambda E)$, or, what amounts to the same thing, with $A$, it commutes with $\mathcal{E}_\lambda$; (2) if $(A - \lambda E) z = 0$, then $\mathcal{E}_\lambda z = z$; (3) we have the inequalities

$$(A - \lambda E) \mathcal{E}_\lambda \le 0; \qquad (A - \lambda E)(E - \mathcal{E}_\lambda) \ge 0. \tag{331}$$

Notice also that, given any $\lambda$, $\mathcal{E}_\lambda$ commutes with $(A - \lambda E)$, i.e. with $A$. Let us show that $\mathcal{E}_\lambda$ represents a resolution of the identity. Every $\mathcal{E}_\lambda$ commutes with $A$, and hence, by what has been proved, with any $\mathcal{E}_\mu$. Let $\lambda < m$. We shall show that $\mathcal{E}_\lambda = 0$ here. If this were not the case, we should have an element $x$ with unit norm such that $\mathcal{E}_\lambda x = x$, so that

$$((A - \lambda E) \mathcal{E}_\lambda x, x) = ((A - \lambda E) x, x) = (Ax, x) - \lambda > 0,$$

since $\lambda < m$, and this contradicts the first of inequalities (331). Thus $\mathcal{E}_\lambda = 0$ for $\lambda < m$. Similarly, by using the second of (331), it can be shown that $\mathcal{E}_\lambda = E$ for $\lambda > M$. It remains to show that $\mathcal{E}_\lambda \le \mathcal{E}_\mu$ for $\lambda < \mu$, i.e. that $\mathcal{E}_\lambda \mathcal{E}_\mu = \mathcal{E}_\lambda$ for $\lambda < \mu$; or, what amounts to the same thing, we have to show that

$$\mathcal{E}_\lambda (E - \mathcal{E}_\mu) = 0. \tag{332}$$

Let us write $R$ for the left-hand side:

$$\mathcal{E}_\lambda (E - \mathcal{E}_\mu) = (E - \mathcal{E}_\mu) \mathcal{E}_\lambda = R. \tag{333}$$

We want to show that, given any element $x$, we have $Rx = 0$. Let us write $Rx = y$. It follows at once from (333) that

$$\mathcal{E}_\lambda R = \mathcal{E}_\lambda^2 (E - \mathcal{E}_\mu) = \mathcal{E}_\lambda (E - \mathcal{E}_\mu) = R \quad \text{and similarly} \quad (E - \mathcal{E}_\mu) R = R. \tag{334}$$

We have by (331):

$$((A - \lambda E) \mathcal{E}_\lambda y, y) \le 0; \qquad ((A - \mu E)(E - \mathcal{E}_\mu) y, y) \ge 0. \tag{335}$$

On the other hand, by (334),

$$\mathcal{E}_\lambda y = \mathcal{E}_\lambda Rx = Rx = y; \qquad (E - \mathcal{E}_\mu) y = (E - \mathcal{E}_\mu) Rx = Rx = y,$$

and the first of inequalities (335) can be rewritten as $((A - \lambda E) y, y) \le 0$, and similarly, the second can be rewritten as $((A - \mu E) y, y) \ge 0$. On subtracting the last from the previous inequality, we get $((\mu - \lambda) y, y) \le 0$, i.e. $(\mu - \lambda) \| y \|^2 \le 0$, whence it follows, since $\lambda < \mu$, that $y = 0$, i.e. $Rx = 0$, and (332) is proved. We shall prove later the continuity of $\mathcal{E}_\lambda$ from the right.

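We require an inequality before proving the integral form of the operator $A$ in terms of $\mathcal{E}_\lambda$. We bring in the projector

$$\Delta = \mathcal{E}_\mu - \mathcal{E}_\lambda \qquad (\mu > \lambda); \tag{336}$$

we can write for any element $x$:

$$((A - \mu E) \mathcal{E}_\mu \Delta x, \Delta x) \le 0; \qquad ((A - \lambda E)(E - \mathcal{E}_\lambda) \Delta x, \Delta x) \ge 0.$$

On using the obvious equations

$$\mathcal{E}_\mu \Delta = (E - \mathcal{E}_\lambda) \Delta = \Delta,$$

we can rewrite our inequalities as

$$((A - \mu E) \Delta x, x) \le 0; \qquad ((A - \lambda E) \Delta x, x) \ge 0$$

or

$$\lambda (\Delta x, x) \le (A \Delta x, x) \le \mu (\Delta x, x).$$

This gives us, on taking any number $\nu$ that satisfies $\lambda \le \nu \le \mu$:

$$| ((A - \nu E) \Delta x, x) | \le (\mu - \lambda)(\Delta x, x)$$

or, since $(\Delta x, x) = \| \Delta x \|^2 \le \| x \|^2$:

$$| ((A - \nu E) \Delta x, x) | \le (\mu - \lambda) \| x \|^2.$$

It is clear from this [126] that the norm of the operator $(A - \nu E) \Delta$ does not exceed $(\mu - \lambda)$, i.e.

$$\| (A - \nu E) \Delta x \| \le (\mu - \lambda) \| x \|.$$

On replacing $x$ in this inequality by $\Delta x$ and noting that $\Delta^2 = \Delta$, we get an inequality fundamental for what follows:

$$\| A \Delta x - \nu \Delta x \| \le (\mu - \lambda) \| \Delta x \|. \tag{337}$$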


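In this inequality, $\lambda \le \nu \le \mu$, and $\Delta$ is defined by (336). We turn to the proof of (204). We take a positive number $\varepsilon_0$ and split the interval $[m - \varepsilon_0, M]$ into sub-intervals:

$$m - \varepsilon_0 = \lambda_0 < \lambda_1 < \lambda_2 < \ldots < \lambda_{n-1} < \lambda_n = M;$$

then we introduce the projectors $\Delta_k = \mathcal{E}_{\lambda_k} - \mathcal{E}_{\lambda_{k-1}}$, where

$$E = \sum_{k=1}^{n} \Delta_k \quad \text{and} \quad \Delta_k \Delta_l = 0 \text{ for } k \ne l. \tag{338}$$

Any element $x$ can be expanded in mutually orthogonal terms:

$$x = \sum_{k=1}^{n} x_k, \quad \text{where } x_k = \Delta_k x.$$

It may easily be seen that

$$(A x_k, A x_l) = 0 \quad \text{and} \quad (A x_k, x_l) = 0 \quad \text{for } k \ne l.$$

For instance, the first of these equations is proved thus:

$$(A x_k, A x_l) = (A \Delta_k x, A \Delta_l x) = (\Delta_l \Delta_k A x, A x),$$

whilst the last expression vanishes, by (338). We now form

$$Ax - \sum_{k=1}^{n} \nu_k x_k = \sum_{k=1}^{n} (A x_k - \nu_k x_k),$$

where $\nu_k$ is any value from the interval $[\lambda_{k-1}, \lambda_k]$. The terms of the sum on the right are mutually orthogonal, and we can write, on using Pythagoras' theorem:

$$\Big\| Ax - \sum_{k=1}^{n} \nu_k x_k \Big\|^2 = \sum_{k=1}^{n} \| A x_k - \nu_k x_k \|^2. \tag{339}$$

Let $\delta$ be the greatest of the differences $\lambda_k - \lambda_{k-1}$. If we use inequality (337), we get from (339):

$$\Big\| Ax - \sum_{k=1}^{n} \nu_k x_k \Big\|^2 \le \delta^2 \sum_{k=1}^{n} \| x_k \|^2$$

or, by Pythagoras' theorem:

$$\Big\| Ax - \sum_{k=1}^{n} \nu_k \Delta_k x \Big\|^2 \le \delta^2 \| x \|^2,$$

whence it follows that, for any element $x$, as $\delta \to 0$:

$$Ax = \lim \sum_{k=1}^{n} \nu_k \Delta_k x.$$

We thus arrive at the basic formula

$$A = \int_m^M \lambda \, d\mathcal{E}_\lambda.$$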


It remains to show that $\mathcal{E}_\lambda$ is continuous from the right. As $\mu$ tends to $\lambda$, the projector $\Delta$ defined by (336) does not increase, and tends to some limit $\Delta_0$; we have to show that $\Delta_0$ is the annihilation operator. On passing to the limit in (337), we get $(A - \lambda E) \Delta_0 x = 0$. Hence it follows, in view of the second property of $\mathcal{E}_\lambda$, that $\mathcal{E}_\lambda \Delta_0 x = \Delta_0 x$, i.e. $(E - \mathcal{E}_\lambda) \Delta_0 x = 0$. On the other hand, $(E - \mathcal{E}_\lambda) \Delta = \Delta$, and we obtain on passing to the limit: $(E - \mathcal{E}_\lambda) \Delta_0 = \Delta_0$. Since $(E - \mathcal{E}_\lambda) \Delta_0 x = 0$, we get $\Delta_0 x = 0$, i.e. $\Delta_0$ in fact annihilates any element $x$. Notice also that any operator $D$ that commutes with $A$ also commutes with $\mathcal{E}_\lambda$.
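
The approximation of $A$ by the sums $\sum \nu_k \Delta_k$ can be watched in a finite-dimensional model. The sketch below assumes a hypothetical diagonal operator, for which $\mathcal{E}_\lambda$ is the projector onto the coordinates with eigenvalue not exceeding $\lambda$; the choice $\nu_k = \lambda_k$, the uniform subdivision and the value $\varepsilon_0 = 0.5$ are likewise assumptions made for illustration.

```python
eigenvalues = [-1.0, 0.25, 0.25, 2.0]   # diagonal of A (hypothetical values)
x = [1.0, -2.0, 0.5, 3.0]               # an arbitrary element


def spectral_projector(lam):
    """Diagonal of E_lambda: 1 where the eigenvalue is <= lambda, else 0."""
    return [1.0 if ev <= lam else 0.0 for ev in eigenvalues]


def stieltjes_sum(nodes):
    """Apply sum_k nu_k * (E_{lambda_k} - E_{lambda_{k-1}}) to x, with nu_k = lambda_k."""
    result = [0.0] * len(x)
    for lo, hi in zip(nodes, nodes[1:]):
        e_lo, e_hi = spectral_projector(lo), spectral_projector(hi)
        for i in range(len(x)):
            result[i] += hi * (e_hi[i] - e_lo[i]) * x[i]
    return result


def error_norm(parts):
    """Norm of Ax minus the Stieltjes sum for a uniform subdivision of [m - 0.5, M]."""
    m, M = min(eigenvalues) - 0.5, max(eigenvalues)
    nodes = [m + (M - m) * k / parts for k in range(parts + 1)]
    approx = stieltjes_sum(nodes)
    exact = [ev * xi for ev, xi in zip(eigenvalues, x)]
    return sum((a - b) ** 2 for a, b in zip(approx, exact)) ** 0.5


norm_x = sum(v * v for v in x) ** 0.5
interval_length = max(eigenvalues) - (min(eigenvalues) - 0.5)
```

For a subdivision into `parts` equal intervals, $\delta$ equals `interval_length / parts`, and `error_norm(parts)` stays within the bound $\delta \| x \|$ obtained above.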


§ 2. Spaces $l_2$ and $L_2$


162. Linear operators in $l_2$. We shall now apply the general theory to spaces $l_2$ and $L_2$. We have already seen that, by choosing some closed orthonormal system in $H$, a one-to-one correspondence is obtained between elements of the abstract space $H$ and $l_2$. Naturally, $l_2$ can be regarded on its own as a concrete form of $H$, since all the axioms of $H$ hold, given the usual definitions of algebra and of scalar product in $l_2$.

The concept of cut-off element [cf. 184] must be introduced. Let $x(\xi_1, \xi_2, \ldots)$ be an element of $l_2$, and let $x^{(k)}(\xi_1, \xi_2, \ldots, \xi_k, 0, 0, \ldots)$ have the same first $k$ components as $x$, its remaining components being zero. The element $x^{(k)}$ is called the cut-off of $x$. We have

$$\| x - x^{(k)} \|^2 = \sum_{m=k+1}^{\infty} | \xi_m |^2 \to 0 \quad \text{as } k \to \infty, \tag{1}$$

i.e. $x^{(k)} \Rightarrow x$ as $k \to \infty$. Let $\varphi_1, \varphi_2, \ldots$ denote the base vectors of $l_2$, i.e. $\xi_k = 1$ for $\varphi_k$, and the remaining components are zero. We have for an element $x$:

$$x = \sum_{m=1}^{\infty} \xi_m \varphi_m. \tag{2}$$

If $A$ is a linear operator in $l_2$ and $x' = Ax$, we have, on introducing the components $x'(\xi'_1, \xi'_2, \ldots)$:

$$\xi'_n = (x', \varphi_n) = \sum_{m=1}^{\infty} a_{nm} \xi_m \qquad (n = 1, 2, \ldots), \tag{3}$$

where

$$a_{nm} = (A \varphi_m, \varphi_n). \tag{4}$$

A linear operator in $l_2$ can therefore be represented by a matrix with elements (4). The matrix corresponding to the conjugate operator $A^*$ is [cf. 134]:

$$a^*_{nm} = (A^* \varphi_m, \varphi_n) = (\varphi_m, A \varphi_n) = \overline{a_{mn}}. \tag{5}$$

A self-conjugate operator is characterized by the equation

$$a_{nm} = \overline{a_{mn}}. \tag{6}$$


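We have for a bilinear functional:

$$(Ax, y) = (x, A^* y) = \sum_{n=1}^{\infty} \Big( \sum_{m=1}^{\infty} a_{nm}\xi_m \Big) \bar{\eta}_n = \sum_{m=1}^{\infty} \xi_m \Big( \sum_{n=1}^{\infty} a_{nm}\bar{\eta}_n \Big), \tag{7}$$

where $y$ has the components $(\eta_1, \eta_2, \ldots)$.
On forming the cut-off elements $x^{(k)}$ and $y^{(l)}$ for $x$ and $y$, we have

$$(Ax^{(k)}, y^{(l)}) = \sum_{n=1}^{l} \sum_{m=1}^{k} a_{nm}\xi_m\bar{\eta}_n.$$

But $(Ax^{(k)}, y^{(l)}) \to (Ax, y)$ as $k$ and $l \to \infty$, so that

$$\sum_{n=1}^{\infty} \Big( \sum_{m=1}^{\infty} a_{nm}\xi_m \Big) \bar{\eta}_n = \lim_{k, l \to \infty} \sum_{n=1}^{l} \sum_{m=1}^{k} a_{nm}\xi_m\bar{\eta}_n. \tag{8}$$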





If $a_{pq}$ and $b_{pq}$ are the elements of the matrices corresponding to
operators $A$ and $B$, the matrix for the operator $D = BA$ is $d_{pq}$,
defined by

$$d_{pq} = (D\varphi_q, \varphi_p) = (BA\varphi_q, \varphi_p) = (A\varphi_q, B^*\varphi_p),$$

or, using the formula for a scalar product in $l_2$:

$$d_{pq} = \sum_{s=1}^{\infty} (A\varphi_q, \varphi_s)\,\overline{(B^*\varphi_p, \varphi_s)} = \sum_{s=1}^{\infty} a_{sq}\,\bar{b}^*_{sp}. \tag{9}$$

On taking (5) into account, we finally obtain

$$d_{pq} = \sum_{s=1}^{\infty} b_{ps}a_{sq}.$$


If we use the same letters $A$ and $B$ for the infinite matrices as for
the operators, and write $\{A\}_{pq}$ and $\{B\}_{pq}$ for the elements of these
matrices, the above formula can be written as

$$\{BA\}_{pq} = \sum_{s=1}^{\infty} \{B\}_{ps}\{A\}_{sq}. \tag{10}$$

Given three linear bounded operators $A$, $B$ and $C$, the associative law
$(CB)A = C(BA)$ can be used to write the following formula for
interchanging summations:

$$\sum_{t=1}^{\infty} \Big( \sum_{s=1}^{\infty} \{C\}_{ps}\{B\}_{st} \Big) \{A\}_{tq} = \sum_{s=1}^{\infty} \{C\}_{ps} \Big( \sum_{t=1}^{\infty} \{B\}_{st}\{A\}_{tq} \Big). \tag{11}$$


163. Bounded operators. As we have seen, every bounded linear
operator generates an infinite matrix $a_{pq}$. Let us pose the converse
problem: what sort of elements $a_{pq}$ must an infinite matrix have in
order for (3) to yield a bounded linear operator in $l_2$? We shall require
that series (3) be convergent for any element $(\xi_1, \xi_2, \ldots)$ of $l_2$, and
that there exist an $N$ such that

$$\sum_{n=1}^{\infty} \Big| \sum_{k=1}^{\infty} a_{nk}\xi_k \Big|^2 \le N^2 \sum_{k=1}^{\infty} |\xi_k|^2 \tag{12}$$

for any element $x \in l_2$.
Remember that, in the case of a bounded operator $A$ we must have
for a bilinear functional:

$$|(Ax, y)| \le N \|x\| \, \|y\|.$$

On applying this inequality to the cut-off elements, the following
necessary condition, containing finite sums, is obtained for the $a_{pq}$:

$$\Big| \sum_{n=1}^{l} \sum_{m=1}^{k} a_{nm}\xi_m\bar{\eta}_n \Big|^2 \le N^2 \sum_{m=1}^{k} |\xi_m|^2 \sum_{n=1}^{l} |\eta_n|^2. \tag{13}$$




This condition is in fact sufficient as well as necessary, i.e. the fol- 
lowing theorem holds: 

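THEOREM 1. The necessary and sufficient condition for $a_{pq}$ to be
elements of the matrix of a linear bounded transformation is that, given
any positive integers $k$ and $l$, and any complex numbers $\xi_m$ and $\eta_n$,
condition (13) is satisfied for some choice of the number $N$ (independent
of $\xi_m$, $\eta_n$, $k$ and $l$).

The necessity has been explained above. Let us prove the sufficiency.
Let $(\xi_1, \xi_2, \ldots)$ be any element of $l_2$. We put $l = k$ in (13)
and

$$\eta_n = \sum_{m=1}^{k} a_{nm}\xi_m.$$

It now gives

$$\Big( \sum_{n=1}^{k} \Big| \sum_{m=1}^{k} a_{nm}\xi_m \Big|^2 \Big)^2 \le N^2 \sum_{m=1}^{k} |\xi_m|^2 \sum_{n=1}^{k} \Big| \sum_{m=1}^{k} a_{nm}\xi_m \Big|^2$$

or

$$\sum_{n=1}^{k} \Big| \sum_{m=1}^{k} a_{nm}\xi_m \Big|^2 \le N^2 \sum_{m=1}^{k} |\xi_m|^2,$$

and all the more

$$\sum_{n=1}^{k} \Big| \sum_{m=1}^{k} a_{nm}\xi_m \Big|^2 \le N^2 \sum_{m=1}^{\infty} |\xi_m|^2. \tag{14}$$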


We shall show that these inequalities imply the convergence of

$$\sum_{m=1}^{\infty} a_{nm}\xi_m \qquad (n = 1, 2, \ldots) \tag{15}$$

for any element of $l_2$. Suppose that the series is divergent for some
choice of element $(\xi_1^{(0)}, \xi_2^{(0)}, \ldots)$ and number $n$. In this case the
series

$$\sum_{m=1}^{\infty} |a_{nm}\xi_m^{(0)}|$$

is all the more divergent, and the finite sum of this series:

$$\sum_{m=1}^{k} |a_{nm}\xi_m^{(0)}|$$

will increase indefinitely as $k$ increases. We vary the arguments of
the complex numbers $\xi_m^{(0)}$ in such a way that the products $a_{nm}\xi_m^{(0)}$
are non-negative numbers. On applying inequality (14) to the element of
$l_2$ thus obtained, and throwing away from the left-hand side all the
terms except the one corresponding to the value of $n$ mentioned, we
obtain

$$\Big( \sum_{m=1}^{k} |a_{nm}\xi_m^{(0)}| \Big)^2 \le N^2 \sum_{m=1}^{\infty} |\xi_m^{(0)}|^2.$$

The left-hand side increases indefinitely on indefinite increase of
$k$, and we have arrived at a contradiction. Thus all the series (15)
are in fact convergent for any element $x$. Let us now show that (14)
implies the inequality

$$\sum_{n=1}^{k} \Big| \sum_{m=1}^{\infty} a_{nm}\xi_m \Big|^2 \le N^2 \sum_{m=1}^{\infty} |\xi_m|^2. \tag{16}$$

In fact, if we were to have the reverse inequality for a certain $k$
and a certain element $x$ of $l_2$, we should have for this $k$, and sufficiently
large $l$ (it can obviously be assumed that $l > k$):

$$\sum_{n=1}^{k} \Big| \sum_{m=1}^{l} a_{nm}\xi_m \Big|^2 > N^2 \sum_{m=1}^{\infty} |\xi_m|^2$$

and all the more

$$\sum_{n=1}^{l} \Big| \sum_{m=1}^{l} a_{nm}\xi_m \Big|^2 > N^2 \sum_{m=1}^{\infty} |\xi_m|^2,$$

and this contradicts (14) with $k = l$. Inequality (16) is thus proved.
On letting $k$ increase indefinitely in it, we arrive at (12), and the
theorem is proved.

Note. Notice that we only used (13) in the case $l = k$ when
proving its sufficiency. Let us show that it is sufficient to verify this
condition only for quadratic forms, i.e. the sufficient condition for
the operator to be bounded is that, for any $k$,

$$\Big| \sum_{m,n=1}^{k} a_{nm}\xi_m\bar{\xi}_n \Big| \le N \sum_{n=1}^{k} |\xi_n|^2. \tag{17}$$

If we use (17) and the formula expressing a bilinear functional in
terms of the corresponding quadratic form, we can write

$$\Big| \sum_{m,n=1}^{k} a_{nm}\xi_m\bar{\eta}_n \Big| \le \frac{N}{4} \Big[ \sum_{m=1}^{k} |\xi_m + \eta_m|^2 + \sum_{m=1}^{k} |\xi_m - \eta_m|^2 + \sum_{m=1}^{k} |\xi_m + i\eta_m|^2 + \sum_{m=1}^{k} |\xi_m - i\eta_m|^2 \Big].$$

Using $|\alpha + \beta|^2 \le 2(|\alpha|^2 + |\beta|^2)$, we obtain, on the assumption
that the norms of $x$ and $y$ are unity:

$$\Big| \sum_{m,n=1}^{k} a_{nm}\xi_m\bar{\eta}_n \Big| \le 4N,$$

whilst for elements with any norm:

$$\Big| \sum_{m,n=1}^{k} a_{nm}\xi_m\bar{\eta}_n \Big| \le 4N \Big[ \sum_{m=1}^{k} |\xi_m|^2 \Big]^{1/2} \Big[ \sum_{m=1}^{k} |\eta_m|^2 \Big]^{1/2},$$

i.e. condition (13) with $l = k$ (and $4N$ in place of $N$) follows from (17), i.e. the operator
defined by the matrix $a_{nm}$ must be bounded.

Some further facts must be mentioned in connection with condition (13).
If $a_{pq}$ satisfy condition (13), the elements of the matrix of the
conjugate operator $a^*_{pq} = \bar{a}_{qp}$ obviously also satisfy this condition,
as must in fact be the case in view of the general theory. We must also
consider the matrix of the transposition operator and the matrix of the
complex conjugate operator:

$$\{A'\}_{pq} = a_{qp}; \qquad \{\bar{A}\}_{pq} = \bar{a}_{pq}. \tag{18}$$

We obviously have

$$A^* = (\bar{A})' = \overline{A'}, \tag{19}$$

and the elements of the matrices of operators $A'$ and $\bar{A}$ obviously
satisfy condition (13), provided it is satisfied by the elements of the
basic matrix $A$. It follows at once from (13) that all the $a_{pq}$ must be
bounded in modulus by a number independent of $p$ and $q$; in fact,
on setting $\xi_q = \eta_p = 1$, and the remaining $\xi_m$ and $\eta_n$ equal to zero,
we get $|a_{pq}| \le N$. There is also a further necessary condition that must
be satisfied by the elements of the matrix of a bounded transformation.
It follows from (4) that the $a_{nq}$ are the components of the element
$A\varphi_q$. The series formed by the squares of the moduli of the elements
of any column must therefore be convergent:

$$\sum_{n=1}^{\infty} |a_{nk}|^2 < +\infty \qquad (k = 1, 2, \ldots). \tag{20}$$

On passing to $A^*$, it will be seen that the same must be true as
regards the rows:

$$\sum_{m=1}^{\infty} |a_{km}|^2 < +\infty \qquad (k = 1, 2, \ldots). \tag{21}$$
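The step in the Note that passes from the quadratic form to the bilinear functional rests on the polarization identity. A minimal sketch for a finite block follows; the function names and the sample matrix used to try it out are assumptions for illustration only.

```python
def bilinear(a, x, y):
    """(Ax, y) = sum over n, m of a[n][m] * x[m] * conj(y[n]) for a finite block."""
    return sum(a[n][m] * x[m] * y[n].conjugate()
               for n in range(len(y)) for m in range(len(x)))


def quadratic(a, x):
    """The quadratic form (Ax, x)."""
    return bilinear(a, x, x)


def polarize(a, x, y):
    """Recover (Ax, y) using only values of the quadratic form (Az, z)."""
    def q(c):
        z = [xi + c * yi for xi, yi in zip(x, y)]
        return quadratic(a, z)
    # (Ax, y) = 1/4 [Q(x+y) - Q(x-y) + i Q(x+iy) - i Q(x-iy)]
    return (q(1) - q(-1) + 1j * q(1j) - 1j * q(-1j)) / 4
```

Since each of the four terms is bounded by $N$ times a squared norm under (17), the identity is what converts a bound on the quadratic form into a bound on the bilinear functional.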




Let us notice a simple sufficient condition for the transformation
corresponding to the matrix $a_{nm}$ to be bounded.

THEOREM 2. If there exists a positive number $l$ (not depending on
$m$ and $n$), such that the inequalities

$$\sum_{m=1}^{\infty} |a_{nm}| \le l \quad (n = 1, 2, \ldots), \tag{22}$$

$$\sum_{n=1}^{\infty} |a_{nm}| \le l \quad (m = 1, 2, \ldots), \tag{23}$$

are fulfilled, the matrix $a_{nm}$ yields a bounded transformation.
It is sufficient to show that, when $\|x\| \le 1$ and $\|y\| \le 1$, the
sum

$$S = \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} |a_{nm}| \, |\xi_m| \, |\eta_n| \tag{24}$$

is bounded. In this case, (13) will all the more be satisfied. On
observing that $|ab| \le \frac{1}{2}(|a|^2 + |b|^2)$, we can write

$$S \le \frac{1}{2} \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} |a_{nm}| \, (|\xi_m|^2 + |\eta_n|^2) = \frac{1}{2} \sum_{m=1}^{\infty} |\xi_m|^2 \sum_{n=1}^{\infty} |a_{nm}| + \frac{1}{2} \sum_{n=1}^{\infty} |\eta_n|^2 \sum_{m=1}^{\infty} |a_{nm}|,$$

or, by (22) and (23),

$$S \le \frac{l}{2} \sum_{m=1}^{\infty} |\xi_m|^2 + \frac{l}{2} \sum_{n=1}^{\infty} |\eta_n|^2 = \frac{l}{2} (\|x\|^2 + \|y\|^2) \le l,$$

and the theorem is proved. In the case of a self-conjugate matrix,
(23) follows from (22). Notice that series (24) is not convergent for
every bounded matrix.
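The criterion of Theorem 2 is easy to try on a finite block. The sketch below computes the number $l$ of (22) and (23) as the largest absolute row or column sum and compares $\|Ax\|$ with $l\|x\|$; the matrix $a_{nm} = 2^{-|n-m|}$, the test vector and all function names are assumptions chosen for the example.

```python
import math


def schur_bound(a):
    """Largest absolute row sum and column sum of a finite block (the number l)."""
    rows = max(sum(abs(v) for v in row) for row in a)
    cols = max(sum(abs(row[m]) for row in a) for m in range(len(a[0])))
    return max(rows, cols)


def apply(a, x):
    """Components of Ax for a finite block."""
    return [sum(v * xi for v, xi in zip(row, x)) for row in a]


def norm(x):
    return math.sqrt(sum(abs(v) ** 2 for v in x))


# Assumed example: a_nm = 2^(-|n-m|), a 50 x 50 block.
n = 50
a = [[2.0 ** (-abs(i - j)) for j in range(n)] for i in range(n)]
bound = schur_bound(a)
x = [(-1) ** j / (j + 1) for j in range(n)]
ratio = norm(apply(a, x)) / norm(x)   # should not exceed `bound`
```

For this block every row and column sum stays below 3, so the theorem gives a bound on the operator that does not depend on the size of the block.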
As an example, let us take the two infinite matrices:

$$A = \begin{pmatrix} 0 & 0 & 0 & \cdots \\ 1 & 0 & 0 & \cdots \\ 0 & 1 & 0 & \cdots \\ 0 & 0 & 1 & \cdots \\ \vdots & \vdots & \vdots & \end{pmatrix}, \qquad B = \begin{pmatrix} a_1 & 1 & 0 & 0 & 0 & \cdots \\ a_2 & 0 & 1 & 0 & 0 & \cdots \\ a_3 & 0 & 0 & 1 & 0 & \cdots \\ a_4 & 0 & 0 & 0 & 1 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \end{pmatrix},$$

where the $a_k$ form a sequence of real numbers such that the series with
common term $|a_k|$ is convergent. Using Theorem 2, it is easily shown
that bounded linear transformations correspond to the matrices $A$ and $B$.
It is also easy to show, by using (9), that $BA = E$ for any choice of
$a_k$ (satisfying the above condition). Thus a linear operator $A$ has an
infinite set of inverse bounded operators from the left, and hence no
bounded inverse from the right. If we pass to the conjugate matrices,
i.e. since the $a_k$ are real, simply to the transposes $A'$ and $B'$, we get
$A'B' = E$. The equation $Ax = y$ has the form $0 = \eta_1$; $\xi_1 = \eta_2$; $\xi_2 = \eta_3$; ...
and, given any $y \in l_2$ with $\eta_1 = 0$, it has a unique solution in $l_2$. The equation
$A'x = y$ has the form $\xi_2 = \eta_1$; $\xi_3 = \eta_2$; ..., and it has an infinite
set of solutions, since $\xi_1$ is arbitrary. This is bound up with the fact
that $A'$ has no bounded inverse from the left.
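The relation $BA = E$ can be checked on finite blocks. The sketch below is only an illustration: it cuts $A$ to an $(N+1) \times N$ block and $B$ to an $N \times (N+1)$ block (an assumption made so that the product is exact), and the function names are hypothetical.

```python
def matmul(p, q):
    """Product of two rectangular matrices given as lists of rows."""
    return [[sum(p[i][s] * q[s][j] for s in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]


def shift_blocks(alpha):
    """Finite blocks of the matrices A and B of the example.

    A is (N+1) x N with ones just below the diagonal; B is N x (N+1) with
    alpha in its first column and ones just above the diagonal.
    """
    n = len(alpha)
    A = [[1 if i == j + 1 else 0 for j in range(n)] for i in range(n + 1)]
    B = [[alpha[i] if j == 0 else (1 if j == i + 1 else 0)
          for j in range(n + 1)] for i in range(n)]
    return A, B


A, B = shift_blocks([0.5, 0.25, 0.125])
left = matmul(B, A)    # identity, whatever the numbers alpha are
right = matmul(A, B)   # first row is zero, so this is not the identity
```

The product `left` does not involve the first column of $B$ at all, which is why every admissible choice of the $a_k$ gives a left inverse.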


164. Unitary matrices and projection matrices. Let us recall the
fundamental property of a unitary transformation $U$:

$$U^*U = UU^* = E.$$

If $u_{pq}$ are the elements of the matrix corresponding to $U$, these
conditions can be written as

$$\sum_{s=1}^{\infty} u^*_{ps}u_{sq} = \delta_{pq}; \qquad \sum_{s=1}^{\infty} u_{ps}u^*_{sq} = \delta_{pq}, \tag{25}$$

where $\delta_{pq} = 0$ for $p \ne q$ and $\delta_{pp} = 1$. Noting that $u^*_{nm} = \bar{u}_{mn}$, we can
write these last equations as

$$\sum_{s=1}^{\infty} \bar{u}_{sp}u_{sq} = \delta_{pq}; \tag{26}$$

$$\sum_{s=1}^{\infty} u_{ps}\bar{u}_{qs} = \delta_{pq}, \tag{27}$$

i.e. the matrix $u_{pq}$ is found to be orthogonal with respect to its columns
and rows. Notice that, in the case of finite matrices, conditions (27)
are consequences of (26) [III; 28]. These conditions are independent
in the case of infinite matrices.

THEOREM 3. Conditions (26) and (27) are necessary and sufficient
for the complex numbers $u_{pq}$ to form a matrix corresponding to a unitary
transformation.

The necessity of (26) and (27) follows from the above discussion.
It remains to prove their sufficiency, i.e. we have to show that, given
these conditions, a linear (bounded) transformation corresponds to
the matrix $u_{pq}$. The fact that this transformation is unitary follows
afterwards, from the fact that conditions (26) and (27) are equivalent
to (25), these latter being characteristic of a unitary transformation
[137].


Condition (27) shows that

$$\sum_{s=1}^{\infty} |u_{ps}|^2 = 1,$$

so that the series

$$\xi'_n = \sum_{k=1}^{\infty} u_{nk}\xi_k \tag{28}$$

are convergent for any element $x$ [59].
We form the expression

$$\sum_{p=1}^{m} \Big| \sum_{q=1}^{n} u_{pq}\xi_q \Big|^2 = \sum_{p=1}^{m} \sum_{s=1}^{n} \sum_{t=1}^{n} u_{ps}\bar{u}_{pt}\xi_s\bar{\xi}_t = \sum_{s=1}^{n} \sum_{t=1}^{n} \Big( \sum_{p=1}^{m} u_{ps}\bar{u}_{pt} \Big) \xi_s\bar{\xi}_t.$$

On letting $m$ increase indefinitely, we find, by (26), that

$$\sum_{p=1}^{\infty} \Big| \sum_{q=1}^{n} u_{pq}\xi_q \Big|^2 = \sum_{q=1}^{n} |\xi_q|^2,$$

and all the more

$$\sum_{p=1}^{m} \Big| \sum_{q=1}^{n} u_{pq}\xi_q \Big|^2 \le \sum_{q=1}^{\infty} |\xi_q|^2,$$

where $m$ is any fixed finite number. This inequality still holds with
$n = \infty$, as follows by assuming the contrary, precisely as in the proof
of Theorem 1 of the previous section. If we then let $m$ increase
indefinitely, we arrive at the inequality

$$\sum_{p=1}^{\infty} \Big| \sum_{q=1}^{\infty} u_{pq}\xi_q \Big|^2 \le \sum_{q=1}^{\infty} |\xi_q|^2, \tag{29}$$

which shows that the operator $U$ is bounded. Notice that, since $U$ is
unitary, we must have the sign of equality in (29).

Let us now consider the matrix $p_{ik}$ corresponding to a projector
$P$ onto subspace $L$. On recalling that $P$ is a self-conjugate operator,
and that $P^2 = P$, we get the following conditions:

$$p_{ki} = \bar{p}_{ik}; \qquad \sum_{s=1}^{\infty} p_{is}p_{sk} = p_{ik}. \tag{30}$$

It can be shown, precisely as above for unitary transformations,
that these conditions are sufficient as well as necessary for the matrix
$p_{ik}$ to correspond to a projector. We choose a closed orthonormal
system so that part of the base vectors form a closed system in the
subspace $L$, and the other part a closed system in the complementary
subspace $H - L$. The former base vectors remain unchanged as a
result of application of the projector $P$, whilst the others are
annihilated. Thus, given the chosen system of base vectors, the projector $P$
corresponds to a purely diagonal matrix, the principal diagonal of
which consists only of ones and zeros. In other words, the matrix of
a projection is the unitary equivalent of a purely diagonal matrix,
the principal diagonal of which consists only of ones and zeros.
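Conditions (30) can be tested directly on a finite matrix. The sketch below builds the matrix of the projector onto the line spanned by a single vector $v$ (a one-dimensional subspace chosen only as an example) and checks both conditions; the function names and tolerance are assumptions.

```python
def projector_onto(v):
    """Matrix p_ik = v_i * conj(v_k) / ||v||^2 of the projector onto the line through v."""
    s = sum(abs(c) ** 2 for c in v)
    return [[vi * vk.conjugate() / s for vk in v] for vi in v]


def satisfies_30(p, tol=1e-12):
    """Check p_ki = conj(p_ik) and sum_s p_is p_sk = p_ik for a finite matrix."""
    n = len(p)
    hermitian = all(abs(p[k][i] - p[i][k].conjugate()) < tol
                    for i in range(n) for k in range(n))
    idempotent = all(abs(sum(p[i][s] * p[s][k] for s in range(n)) - p[i][k]) < tol
                     for i in range(n) for k in range(n))
    return hermitian and idempotent
```

A self-conjugate matrix that is not idempotent, such as a diagonal matrix with an entry 2, fails the second condition and so does not correspond to a projector.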


165. Self-conjugate matrices. A self-conjugate matrix $A$ is
characterized by conditions (13) and (6). The eigenvalues and eigenelements
of such matrices are defined from the condition that the infinite system

$$\sum_{k=1}^{\infty} a_{ik}\xi_k = \lambda\xi_i \qquad (i = 1, 2, \ldots) \tag{31}$$

has non-zero solutions in $l_2$. If the eigenelements $\varphi_p$ form a closed
orthonormal system, and they are taken as the base vectors, the
matrix corresponding to the operator $A$ will have the elements

$$a_{pq} = (A\varphi_q, \varphi_p) = (\lambda_q\varphi_q, \varphi_p) = \begin{cases} 0 & \text{for } p \ne q, \\ \lambda_p & \text{for } p = q, \end{cases} \tag{32}$$

i.e. we get a purely diagonal matrix with the numbers $\lambda_p$ on the
principal diagonal. In general, the necessary and sufficient condition
for a self-conjugate matrix to have a purely point spectrum is that it
be the unitary equivalent of a purely diagonal matrix. In the above
case, when the $\varphi_p$ are the base vectors, we have

$$(Ax, y) = \sum_{p=1}^{\infty} \lambda_p\xi_p\bar{\eta}_p; \qquad (Ax, x) = \sum_{p=1}^{\infty} \lambda_p|\xi_p|^2. \tag{33}$$

In the general case, given a self-conjugate matrix $a_{ik}$, there exists
a resolution of the identity $\mathcal{E}_\lambda$, i.e. a non-decreasing projection matrix
$l_{ik}(\lambda)$, such that

$$l_{ik}(a) = 0; \qquad l_{ik}(b) = \begin{cases} 0 & \text{for } i \ne k, \\ 1 & \text{for } i = k \end{cases} \qquad (i, k = 1, 2, \ldots); \tag{34}$$


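and we have the formulae:

$$\xi'_i = \sum_{k=1}^{\infty} a_{ik}\xi_k = (Ax, \varphi_i) = \int_a^b \lambda \, d(\mathcal{E}_\lambda x, \varphi_i) = \int_a^b \lambda \, d\Big( \sum_{k=1}^{\infty} l_{ik}(\lambda)\xi_k \Big),$$

i.e.

$$a_{ik} = \int_a^b \lambda \, dl_{ik}(\lambda) \tag{35}$$

and

$$(Ax, y) = \int_a^b \lambda \, d\Big( \sum_{i,k=1}^{\infty} l_{ik}(\lambda)\xi_k\bar{\eta}_i \Big); \qquad (Ax, x) = \int_a^b \lambda \, d\Big( \sum_{i,k=1}^{\infty} l_{ik}(\lambda)\xi_k\bar{\xi}_i \Big). \tag{36}$$

We have here, by the properties of resolution of the identity,

$$\sum_{s=1}^{\infty} l_{is}(\lambda_1)l_{sk}(\lambda_2) = \sum_{s=1}^{\infty} l_{is}(\lambda_2)l_{sk}(\lambda_1) = l_{ik}(\lambda_1) \quad \text{for } \lambda_1 \le \lambda_2, \tag{37}$$

and in general

$$\sum_{s=1}^{\infty} \Delta_1 l_{is}(\lambda) \cdot \Delta_2 l_{sk}(\lambda) = \Delta_1\Delta_2 l_{ik}(\lambda), \tag{38}$$

where the right-hand side represents the difference in the values of
$l_{ik}(\lambda)$ at the ends of the interval $\Delta_1\Delta_2$, this latter being the intersection
of the intervals $\Delta_1$ and $\Delta_2$. If $f(\lambda)$ is a non-decreasing function in the
interval $[a, b]$, the operator $f(A)$ corresponds to a matrix with elements

$$\{f(A)\}_{ik} = \int_a^b f(\lambda) \, dl_{ik}(\lambda). \tag{39}$$

We can write [43]:

$$\sum_{s=1}^{\infty} \int_a^b f_1(\lambda) \, dl_{is}(\lambda) \int_a^b f_2(\lambda) \, dl_{sk}(\lambda) = \int_a^b f_1(\lambda)f_2(\lambda) \, dl_{ik}(\lambda). \tag{40}$$

Noticing that the bilinear form

$$(\mathcal{E}_\lambda x, y) = \sum_{i,k=1}^{\infty} l_{ik}(\lambda)\xi_k\bar{\eta}_i \tag{41}$$

is a function of bounded variation of $\lambda$, we can say that the functions
$l_{ik}(\lambda)$ are of bounded variation in $\lambda$. When $y = x$, (41) yields a
non-decreasing function of $\lambda$, and it follows at once from this that the
functions $l_{kk}(\lambda)$ are not decreasing. If (39) are understood as
Lebesgue-Stieltjes integrals, this formula is applicable for the wide class of
functions $f(\lambda)$, which we indicated in [155]. It is sufficient to assume
that $f(\lambda)$ is bounded and is a B function [47]. In this case it will be
measurable with respect to any non-decreasing function. In the case
of a purely continuous spectrum, all the functions $l_{ik}(\lambda)$ are continuous.
The converse is also true. In the case of a mixed spectrum, we consider
the subspace $L$, in which the operator $A$ has a purely point spectrum,
and the complementary subspace $H - L$, in which $A$ has a purely
continuous spectrum. Let us introduce closed orthonormal systems
into these subspaces. On writing $(\xi'_1, \xi'_2, \ldots)$ for the elements in $L$,
and $(\xi''_1, \xi''_2, \ldots)$ for the elements in $H - L$, we can write the bilinear
and quadratic forms as

$$\sum_{i,k=1}^{\infty} a_{ik}\xi_k\bar{\eta}_i = \sum_k \lambda_k\xi'_k\bar{\eta}'_k + \int_a^b \lambda \, d\Big( \sum_{s,t=1}^{\infty} l_{st}(\lambda)\xi''_t\bar{\eta}''_s \Big),$$

$$\sum_{i,k=1}^{\infty} a_{ik}\xi_k\bar{\xi}_i = \sum_k \lambda_k|\xi'_k|^2 + \int_a^b \lambda \, d\Big( \sum_{s,t=1}^{\infty} l_{st}(\lambda)\xi''_t\bar{\xi}''_s \Big),$$

where the $l_{st}(\lambda)$ are continuous.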
Let us consider the resolvent of the matrix $A$, i.e. the matrix with
elements $\{R(\lambda)\}_{ik}$, defined by

$$\{R(\lambda)\}_{ik} = \{(A - \lambda E)^{-1}\}_{ik}. \tag{42}$$

We have

$$\{R(\lambda)\}_{ik} = \int_a^b \frac{dl_{ik}(\mu)}{\mu - \lambda}, \tag{43}$$

on the assumption that $\lambda$ does not belong to the spectrum of $A$.
Notice that, by (39), positive integral powers of $A$ can be written as

$$\{A^n\}_{ik} = \int_a^b \lambda^n \, dl_{ik}(\lambda). \tag{44}$$

If $\lambda = 0$ does not belong to the spectrum of $A$, i.e. all the $l_{ik}(\lambda)$ are
constant in some neighbourhood of $\lambda = 0$, there exists a bounded
inverse matrix $A^{-1}$, and powers of it are given by

$$\{A^{-n}\}_{ik} = \int_a^b \lambda^{-n} \, dl_{ik}(\lambda). \tag{45}$$


166. The case of a continuous spectrum. As we know, a further
dissection of $H - L$ can be carried out, into subspaces invariant
with respect to $A$, in each of which $A$ has a simple continuous spectrum.


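Let $H_1$ be a subspace of this type. We introduce a closed orthonormal
system into it and take our future remarks to refer to $H_1$; we can
regard $H_1$ as a Hilbert space, into which the base vectors $\varphi_1, \varphi_2, \ldots$
are introduced, so that every element $x$ is defined by its components
$(\xi_1, \xi_2, \ldots)$. Let $x$ be an element of $H_1$, such that the closed linear
envelope of $\mathcal{E}_\lambda x$ is the whole of $H_1$, where $a \le \lambda \le b$. Let $p_k(\lambda)$ denote
the components of the element $\mathcal{E}_\lambda x$. Any element $y(\eta_1, \eta_2, \ldots)$ of $H_1$
is associated with the function

$$\rho_y(\lambda) = (y, \mathcal{E}_\lambda x) = \sum_{k=1}^{\infty} \eta_k\,\overline{p_k(\lambda)}. \tag{46}$$

Using formula (259) of [147], we can write a bilinear functional in
the form

$$(Ay, z) = \sum_{i,k=1}^{\infty} a_{ik}\eta_k\bar{\zeta}_i = \int_a^b \lambda \, \frac{d\rho_y(\lambda) \, d\overline{\rho_z(\lambda)}}{d\rho(\lambda)} = \sum_{i,k=1}^{\infty} \eta_k\bar{\zeta}_i \int_a^b \lambda \, \frac{d\overline{p_k(\lambda)} \, dp_i(\lambda)}{d\rho(\lambda)},$$

where

$$\rho(\lambda) = \|\mathcal{E}_\lambda x\|^2 = \sum_{k=1}^{\infty} |p_k(\lambda)|^2, \tag{47}$$

so that, given our system of base vectors, the elements of the matrix
defining the transformation $A$ in $H_1$ are

$$a_{ik} = \int_a^b \lambda \, \frac{d\overline{p_k(\lambda)} \, dp_i(\lambda)}{d\rho(\lambda)}. \tag{48}$$

Further, given any element $y$ of $H_1$, there is a corresponding function
$y(\lambda)$ of $L_2$ with respect to $\rho(\lambda)$ in the interval $[a, b]$ such that

$$\sum_{k=1}^{\infty} \overline{p_k(\lambda)}\,\eta_k = \int_a^\lambda y(\mu) \, d\rho(\mu), \tag{49}$$

and conversely, given any function $y(\lambda)$ of $L_2$, there is a corresponding
definite element $y$ of $H_1$. Norms and scalar products are preserved in
this correspondence. If we let $\varphi_k(\lambda)$ denote the function $y(\lambda)$
corresponding to the base vector $\varphi_k$, we obtain

$$\overline{p_k(\lambda)} = \int_a^\lambda \varphi_k(\mu) \, d\rho(\mu), \tag{50}$$

and the $\varphi_k(\lambda)$ $(k = 1, 2, \ldots)$ form a closed orthonormal system
in $L_2$. The operator $A$ in $H_1$ corresponds to multiplication by $\lambda$ in $L_2$,
and instead of (48), we can write for $a_{ik} = (A\varphi_k, \varphi_i)$:

$$a_{ik} = \int_a^b \lambda\,\varphi_k(\lambda)\,\overline{\varphi_i(\lambda)} \, d\rho(\lambda). \tag{51}$$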


Suppose that we took, instead of the $\varphi_k(\lambda)$, a complete orthonormal
system $\psi_k(\lambda)$ in $L_2$, with some complete system of base vectors
$\psi_k$ in $H_1$ corresponding to these $\psi_k(\lambda)$ $(k = 1, 2, \ldots)$. Let us introduce
a unitary transformation $U$ into $H_1$, transforming the base vectors $\varphi_k$
into base vectors $\psi_k$, i.e. $U\varphi_k = \psi_k$. This unitary transformation $U$
in $H_1$ will correspond to some matrix, which itself depends on the
choice of base vectors. If we choose the $\varphi_k$ or $\psi_k$ as base vectors, we
get the same matrix with elements

$$c_{ik} = (U\varphi_k, \varphi_i) = (\psi_k, \varphi_i),$$

or

$$c_{ik} = (U\psi_k, \psi_i) = (\psi_k, U^*\psi_i) = (\psi_k, U^{-1}\psi_i) = (\psi_k, \varphi_i).$$

Since passage from $H_1$ to $L_2$ does not change the scalar product, we
can write

$$c_{ik} = \int_a^b \psi_k(\lambda)\,\overline{\varphi_i(\lambda)} \, d\rho(\lambda). \tag{52}$$

In the new base vectors, the elements of the matrix that corresponds
to the operator $A$ will be defined by formulae analogous to (51):

$$b_{ik} = \int_a^b \lambda\,\psi_k(\lambda)\,\overline{\psi_i(\lambda)} \, d\rho(\lambda). \tag{53}$$

If $\xi_k$ $(k = 1, 2, \ldots)$ are the components of an element in the base
vectors $\varphi_k$, and $\xi'_k$ are the components of the same element in base
vectors $\psi_k$, then $\xi_k = (x, \varphi_k)$ and $\xi'_k = (x, \psi_k) = (x, U\varphi_k) = (U^*x, \varphi_k)$,
whence it is clear that $(\xi'_1, \xi'_2, \ldots)$ is expressible in terms of $(\xi_1, \xi_2, \ldots)$
with the aid of the inverse matrix to $c_{ik}$. Thus, if we write $A$, $B$ and $C$
for the matrices with elements $a_{ik}$, $b_{ik}$ and $c_{ik}$, we have the matrix
equation

$$B = C^{-1}AC. \tag{54}$$
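Relation (54) can be verified in a finite-dimensional stand-in for $H_1$. In the sketch below the $2 \times 2$ self-conjugate matrix, the two orthonormal bases and all function names are assumptions for illustration; since $C$ is unitary, its inverse is computed as the conjugate transpose.

```python
import math


def inner(x, y):
    """Scalar product (x, y) = sum of x_j * conj(y_j)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def matvec(a, x):
    return [sum(v * xi for v, xi in zip(row, x)) for row in a]


def matmul(p, q):
    return [[sum(p[i][s] * q[s][j] for s in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]


def conj_transpose(c):
    return [[c[j][i].conjugate() for j in range(len(c))] for i in range(len(c[0]))]


def matrix_in_basis(a, basis):
    """Entries (A e_k, e_i), with row index i and column index k."""
    return [[inner(matvec(a, ek), ei) for ek in basis] for ei in basis]


# Assumed data: A in the base vectors phi, and a second orthonormal system psi.
A = [[2 + 0j, 1j], [-1j, 3 + 0j]]
r = 1 / math.sqrt(2)
phi = [[1 + 0j, 0j], [0j, 1 + 0j]]
psi = [[r + 0j, r * 1j], [r * 1j, r + 0j]]

C = [[inner(pk, fi) for pk in psi] for fi in phi]      # c_ik = (psi_k, phi_i)
B = matrix_in_basis(A, psi)                             # b_ik = (A psi_k, psi_i)
B_from_54 = matmul(conj_transpose(C), matmul(A, C))     # C^{-1} A C with C^{-1} = C*
```

The two matrices `B` and `B_from_54` agree up to rounding, which is the finite-dimensional content of (54).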

Using formula (263) of [147], together with (49) and (50), we can
write

$$\{\mathcal{E}_\mu\}_{ik} = l_{ik}(\mu) = \int_a^\mu \varphi_k(\lambda)\,\overline{\varphi_i(\lambda)} \, d\rho(\lambda). \tag{55}$$

On recalling (39), we can write the expression for the elements of
the matrix $f(A)$ (in base vectors $\varphi_k$), where $f(\lambda)$ is any bounded B
function, defined in the interval $[a, b]$:

$$\{f(A)\}_{ik} = \int_a^b f(\lambda)\,\varphi_k(\lambda)\,\overline{\varphi_i(\lambda)} \, d\rho(\lambda). \tag{56}$$

If $f(\lambda) = 1/(\lambda - \mu)$, we get the resolvent of the operator:

$$\{R_\mu\}_{ik} = \int_a^b \frac{\varphi_k(\lambda)\,\overline{\varphi_i(\lambda)}}{\lambda - \mu} \, d\rho(\lambda). \tag{57}$$

If $f(\lambda)$ is a real function, $f(A)$ is a self-conjugate operator, and we can
write expressions similar to (55) for the elements of the spectral
function $\mathcal{E}'_\mu$ of the operator $f(A)$:

$$\{\mathcal{E}'_\mu\}_{ik} = \int_{O_\mu} \varphi_k(\lambda)\,\overline{\varphi_i(\lambda)} \, d\rho(\lambda), \tag{58}$$

where $O_\mu$ is the set of values of $\lambda$, defined by $f(\lambda) \le \mu$. We shall not
dwell on the proof of this formula. The nature of the spectrum of $f(A)$
will vary, depending on the properties of $f(\lambda)$.

We started above from a given operator $A$, self-conjugate in $H_1$,
and a given element $x$ such that the closed linear envelope of $\mathcal{E}_\lambda x$
is $H_1$. By introducing the base vectors $\varphi_k$, we arrived at $l_2$ and infinite
matrices for which the above formulae hold. Conversely, we can choose
any continuous function $\rho(\lambda)$, non-decreasing in the interval $[a, b]$,
and vanishing at $\lambda = a$, and a closed orthonormal system $\varphi_k(\lambda)$.
After this, formulae (51) define elements $a_{ik}$ that obviously satisfy the
condition $a_{ik} = \bar{a}_{ki}$. It may soon be shown that the matrix with
elements $a_{ik}$ corresponds to a bounded operator in $l_2$. For, on writing
$N$ for the greatest value of $|\lambda|$ in the interval $[a, b]$, we have, by (51),

$$\Big| \sum_{i,k=1}^{m} a_{ik}\xi_k\bar{\xi}_i \Big| \le N \int_a^b \Big| \sum_{k=1}^{m} \varphi_k(\lambda)\xi_k \Big|^2 d\rho(\lambda),$$

or, since the $\varphi_k(\lambda)$ are orthogonal and normalized:

$$\Big| \sum_{i,k=1}^{m} a_{ik}\xi_k\bar{\xi}_i \Big| \le N \sum_{k=1}^{m} |\xi_k|^2,$$

whence it follows that the corresponding operator is bounded. It is
self-conjugate, since $a_{ik} = \bar{a}_{ki}$. Formulae (55) define the elements of a
projection operator which depends on the parameter $\mu$ and is a
resolution of the identity, where obviously,

$$a_{ik} = \int_a^b \mu \, d\{\mathcal{E}_\mu\}_{ik},$$

i.e. $\mathcal{E}_\mu$ is the spectral function of the operator $A$. If we pass to the
conjugates in (50), we get the components $p_k(\lambda)$ of the element $\mathcal{E}_\lambda x$, and
when $\lambda = b$, the components of the element $x$ itself. In view of the
closure equation, it follows from (50) that $\rho(\lambda)$ is given by (47). In the
general case of a self-conjugate operator $A$ with continuous spectrum,
we form pair-wise orthogonal invariant subspaces $H_1, H_2, \ldots$ in each
of which $A$ has a simple spectrum. If we bring in the base vectors for
each $H_k$, we obtain formulae of the above type for each $H_k$. The final
expression, say for the bilinear form $(Ax, y)$, can then be obtained by
addition of the bilinear forms in each of the $H_k$.

The concept of simple spectrum can readily be generalized, by
renouncing the requirement that the spectral function $\mathcal{E}_\lambda$ be
continuous. But there must exist, as before, an element $x$ such that $\mathcal{E}_\lambda x$
forms the whole of $H$. The non-decreasing function $\rho(\lambda)$ given by (47)
is not necessarily continuous in this case. We can obviously take $x$ to be
a normalized element so that $\rho(b) = 1$ as well as $\rho(a) = 0$. If $A$ has, say,
a purely point spectrum and all the eigenvalues have rank equal to
one, on taking $x$ to be any element such that all its Fourier coefficients
with respect to the closed system of eigenelements differ from zero,
we can assert that $\mathcal{E}_\lambda x$ forms the whole of $H$, and the spectrum in
question will be simple. It may easily be seen that the spectrum cannot
be simple when multiple eigenvalues exist. We get the general case
of a simple spectrum if, on splitting the whole of $H$ into a subspace of
eigenelements and a subspace with a continuous spectrum, simple
spectra are obtained in both cases. The spectrum will be simple in the
first subspace when and only when all the eigenvalues are simple.
If a simple spectrum is not continuous, $\mathcal{E}_\lambda$ has jumps, and the $\rho(\lambda)$
given by (47) must also have discontinuities at the points where $\mathcal{E}_\lambda$
is discontinuous. For, if $\rho(\lambda)$ were continuous at a point $\lambda = \lambda'$ where
the spectral function $\mathcal{E}_\lambda$ is discontinuous, we should have $(\mathcal{E}_{\lambda'} - \mathcal{E}_{\lambda'-0})x = 0$,
and all the elements of the space formed by $\mathcal{E}_\lambda x$ would be
orthogonal to the eigenelements corresponding to the eigenvalue
$\lambda = \lambda'$, whence it would follow that $\mathcal{E}_\lambda x$ cannot form the whole of $H$.




167. Jacobian matrices. Let $A$ be a self-conjugate operator, having
a simple continuous spectrum, in infinite-dimensional space $H$. On
orthogonalizing powers of $\lambda$ with respect to $\rho(\lambda)$ in the interval $[a, b]$,
we obtain a system of real polynomials $P_k(\lambda)$ $(k = 0, 1, 2, \ldots)$ of
degrees $k$:

$$\int_a^b P_i(\lambda)P_k(\lambda) \, d\rho(\lambda) = \begin{cases} 0 & \text{for } i \ne k, \\ 1 & \text{for } i = k, \end{cases} \tag{59}$$

as the closed system $\varphi_k(\lambda)$ of the previous section.

The first coefficient in each $P_k(\lambda)$ can be assumed positive. In the
previous notation we enumerated the $\varphi_k(\lambda)$ starting from $k = 1$. We
now enumerate the $P_k(\lambda)$ starting with $k = 0$, since $k$ denotes the
degree of $P_k(\lambda)$. Thus the $P_k(\lambda)$ replace the $\varphi_{k+1}(\lambda)$. Given our choice
of base vectors, the elements of the matrix corresponding to the
operator $A$ are defined, in accordance with (51), by

$$a_{ik} = \int_a^b \lambda P_i(\lambda)P_k(\lambda) \, d\rho(\lambda) \qquad (i, k = 0, 1, 2, \ldots). \tag{60}$$

Let $k - i > 1$. The product $\lambda P_i(\lambda)$ is now a polynomial of degree
lower than $k$; this product is linearly expressible in terms of $P_s(\lambda)$ for
$s < k$, and, by (59), integral (60) is now equal to zero. Similarly, it is
equal to zero for $i - k > 1$, since $a_{ki} = a_{ik}$, i.e. $a_{ik} = 0$ for $|i - k| > 1$.
Let us introduce the notation:

$$a_k = \int_a^b \lambda P_k^2(\lambda) \, d\rho(\lambda); \qquad b_k = \int_a^b \lambda P_k(\lambda)P_{k+1}(\lambda) \, d\rho(\lambda) \qquad (k = 0, 1, 2, \ldots). \tag{61}$$

The number $b_k$ appears in the linear form of the product $\lambda P_k(\lambda)$
in terms of $P_s(\lambda)$ $(s = 0, 1, 2, \ldots, k + 1)$:

$$\lambda P_k(\lambda) = b_k P_{k+1}(\lambda) + \sum_{s=0}^{k} c_s^{(k)} P_s(\lambda), \tag{62}$$

and $b_k > 0$, since the first coefficients of the $P_k(\lambda)$ are positive. It
follows from (60) and what has been said that

$$a_{kk} = a_k; \qquad a_{k,k+1} = a_{k+1,k} = b_k; \qquad a_{ik} = 0 \ \text{ for } |i - k| > 1, \tag{63}$$

and the matrix of the transformation therefore has the form, given
our choice of base vectors:

$$\begin{pmatrix} a_0 & b_0 & 0 & 0 & 0 & \cdots \\ b_0 & a_1 & b_1 & 0 & 0 & \cdots \\ 0 & b_1 & a_2 & b_2 & 0 & \cdots \\ 0 & 0 & b_2 & a_3 & b_3 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \end{pmatrix}, \tag{64}$$

where $b_k > 0$. A real self-conjugate matrix that satisfies conditions
(63) is described as Jacobian. Therefore, given a suitable choice of
base vectors, the matrix of a self-conjugate operator with a simple
continuous spectrum is a Jacobian matrix.
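The passage from $\rho(\lambda)$ to the numbers $a_k$, $b_k$ of (61) can be sketched numerically by replacing the Stieltjes integrals with a finite quadrature sum and orthonormalizing the powers of $\lambda$ step by step. Everything specific below is an assumption for illustration: the midpoint rule, the test measure $d\rho = \frac{2}{\pi}\sqrt{1 - \lambda^2}\,d\lambda$ on $[-1, 1]$, and the function names.

```python
import math


def jacobi_coefficients(nodes, weights, count):
    """First `count` numbers a_k and b_k of (61) for a discrete measure.

    Polynomials are stored through their values at the nodes; each step
    orthonormalizes lam * P_k against P_k and P_{k-1}.
    """
    total = sum(weights)
    p_prev = [0.0] * len(nodes)                     # P_{-1} = 0
    p_cur = [1.0 / math.sqrt(total)] * len(nodes)   # normalized constant P_0
    a, b = [], []
    b_prev = 0.0
    for _ in range(count):
        a_k = sum(w * t * p * p for w, t, p in zip(weights, nodes, p_cur))
        q = [(t - a_k) * p - b_prev * pp for t, p, pp in zip(nodes, p_cur, p_prev)]
        b_k = math.sqrt(sum(w * v * v for w, v in zip(weights, q)))
        a.append(a_k)
        b.append(b_k)
        p_prev, p_cur, b_prev = p_cur, [v / b_k for v in q], b_k
    return a, b


def semicircle_rule(n):
    """Midpoint rule on [-1, 1] for the assumed density (2/pi) sqrt(1 - lam^2)."""
    h = 2.0 / n
    nodes = [-1.0 + (j + 0.5) * h for j in range(n)]
    weights = [2.0 / math.pi * math.sqrt(1.0 - t * t) * h for t in nodes]
    return nodes, weights
```

For this particular measure the computed $a_k$ come out close to 0 and the $b_k$ close to $1/2$, so the Jacobian matrix has a constant diagonal and constant off-diagonals.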

The coefficients in expansion (62) are readily calculated by using
(59) and notation (61), by multiplying both sides of (62) by $P_m(\lambda)\,d\rho(\lambda)$
and integrating with respect to $\lambda$. When $m < k - 1$, the integral of
$\lambda P_k(\lambda)P_m(\lambda)\,d\rho(\lambda)$ vanishes, as we have seen, and hence it follows that
the $c_m^{(k)} = 0$ for $m < k - 1$. We use notation (61) when calculating
the remaining coefficients, and arrive at the following relationships
between the polynomials $P_k(\lambda)$:

$$\lambda P_k(\lambda) = b_k P_{k+1}(\lambda) + a_k P_k(\lambda) + b_{k-1} P_{k-1}(\lambda), \tag{65}$$

where

$$P_{-1}(\lambda) = 0; \qquad P_0(\lambda) = 1. \tag{66}$$

The last equation is due to the fact that we can assume $\rho(b) = 1$, as
mentioned above, whilst the first is regarded as a definition. As in the
previous section, it is also possible to start from a continuous,
non-decreasing function $\rho(\lambda)$, form a system of polynomials, orthogonal
with respect to $\rho(\lambda)$, and elements of a Jacobian matrix in accordance
with (60). By what was said in [163], the elements of matrix (64) must
be bounded in absolute value by the same number. This is also easily
proved from (61). The above arguments lead to the result:

THEOREM 4. Every self-conjugate matrix, corresponding to a bounded
operator with a simple continuous spectrum, is the unitary equivalent
of some Jacobian matrix of type (64) with bounded elements and $b_k > 0$.
We can obtain all such matrices from (60), where $[a, b]$ is any finite
interval, $\rho(\lambda)$ is a non-decreasing function which is continuous in this interval
and is subject to the conditions $\rho(a) = 0$ and $\rho(b) = 1$ (the latter is not
important), and $P_k(\lambda)$ form a system of polynomials, orthogonal and
normalized with respect to $\rho(\lambda)$.
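Recurrence (65) with the initial values (66) gives a direct way of evaluating the polynomials $P_k(\lambda)$ once the numbers $a_k$ and $b_k$ are known. A minimal sketch follows; the function name and argument layout are assumptions.

```python
def orthonormal_polys(a, b, lam, count):
    """Values P_0(lam), ..., P_{count-1}(lam) from (65) and (66).

    Solves lam*P_k = b_k*P_{k+1} + a_k*P_k + b_{k-1}*P_{k-1} for P_{k+1},
    starting from P_{-1} = 0 and P_0 = 1. Requires b_k > 0.
    """
    values = [1.0]
    prev, cur = 0.0, 1.0
    for k in range(count - 1):
        b_before = b[k - 1] if k > 0 else 0.0
        nxt = ((lam - a[k]) * cur - b_before * prev) / b[k]
        values.append(nxt)
        prev, cur = cur, nxt
    return values
```

For instance, with all $a_k = 0$ and all $b_k = 1/2$ the recurrence yields $P_1(\lambda) = 2\lambda$ and $P_2(\lambda) = 4\lambda^2 - 1$.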

We shall start from a given Jacobian matrix, on the assumption
that the elements $a_{ik} = \bar{a}_{ki}$ for $|i - k| = 1$ and differ from zero. If
we pass from the initial system of base vectors to a new system, by
multiplying each vector by an expression of the form $e^{i\omega_k}$, it may
easily be seen that, given a suitable choice of $\omega_k$, we get a Jacobian
matrix that is the unitary equivalent of the given matrix, and is such
that the elements $a_{ik}$ are positive for $|i - k| = 1$. We can therefore
assume that the given Jacobian matrix has the form (64), where the $a_k$
are obviously real and $b_k > 0$. On recalling Theorem 2 of [163], and
one of the corollaries of Theorem 1, we can say that the necessary
and sufficient condition for matrix (64) to effect a linear bounded
transformation is that the numbers $a_k$ and $b_k$ be bounded by the same
number $N$, independent of $k$:

$$|a_k| \le N; \qquad |b_k| \le N. \tag{67}$$

We shall assume this in future. Let

$$\psi_0, \psi_1, \psi_2, \ldots \tag{68}$$

be the fundamental system of base vectors. Let $A$ be the self-conjugate
operator corresponding to matrix (64); we can write

$$A\psi_k = b_{k-1}\psi_{k-1} + a_k\psi_k + b_k\psi_{k+1} \qquad (k = 0, 1, \ldots;\ \psi_{-1} = 0). \tag{69}$$

If we bring in the polynomials $P_k(\lambda)$ defined by (65) and (66), we
can use (69) to express any $\psi_k$ directly in terms of $\psi_0$ by

$$\psi_k = P_k(A)\psi_0. \tag{70}$$
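Formula (70) can be checked on a finite block of matrix (64): applying the vector form of recurrence (65) to the first base vector reproduces the later base vectors one after another. The block size, the sample coefficients and the function names below are assumptions made for the sketch.

```python
def jacobi_matrix(a, b):
    """Finite n x n block of (64): diagonal a_0..a_{n-1}, off-diagonals b_0..b_{n-2}."""
    n = len(a)
    J = [[0.0] * n for _ in range(n)]
    for k in range(n):
        J[k][k] = a[k]
        if k + 1 < n:
            J[k][k + 1] = J[k + 1][k] = b[k]
    return J


def krylov_basis(J, a, b):
    """Vectors P_k(J) e_0 for k = 0..n-1, computed by the vector form of (65)."""
    n = len(J)
    prev = [0.0] * n
    cur = [1.0] + [0.0] * (n - 1)
    out = [cur]
    for k in range(n - 1):
        Jc = [sum(J[i][j] * cur[j] for j in range(n)) for i in range(n)]
        b_before = b[k - 1] if k > 0 else 0.0
        nxt = [(Jc[i] - a[k] * cur[i] - b_before * prev[i]) / b[k] for i in range(n)]
        out.append(nxt)
        prev, cur = cur, nxt
    return out
```

Within the block the vector number $k$ returned by `krylov_basis` is the $k$-th coordinate vector, which is the statement $\psi_k = P_k(A)\psi_0$ for the truncated matrix.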


Let $\mathcal{E}_\lambda$ be the spectral function of the operator $A$, i.e. of matrix (64).
It follows at once from (70) that the elements $\mathcal{E}_\lambda\psi_0$ form the whole
of $H$. For, by (70),

$$\psi_k = \int_a^b P_k(\lambda) \, d\mathcal{E}_\lambda\psi_0,$$

where $a$ and $b$ are the bounds of operator (64), i.e. $\psi_k$ is a limit of linear
combinations of elements $\mathcal{E}_\lambda\psi_0$, and every element can be expanded
in base vectors $\psi_k$. A simple (not necessarily continuous) spectrum
thus corresponds to a Jacobian matrix, and the role of basic element $x$
can be played by the first base vector $\psi_0$. We can write, on the basis
of (70):

$$(\psi_i, \psi_k) = (P_i(A)\psi_0, P_k(A)\psi_0) = (P_i(A)P_k(A)\psi_0, \psi_0);$$
$$(A\psi_i, \psi_k) = (AP_i(A)P_k(A)\psi_0, \psi_0),$$

and, on introducing the function

$$\rho(\lambda) = (\mathcal{E}_\lambda\psi_0, \psi_0) = \|\mathcal{E}_\lambda\psi_0\|^2, \tag{71}$$

we obtain from these equations:

$$(\psi_i, \psi_k) = \int_a^b P_i(\lambda)P_k(\lambda) \, d\rho(\lambda) = \begin{cases} 0 & \text{for } i \ne k, \\ 1 & \text{for } i = k, \end{cases}$$

$$(A\psi_i, \psi_k) = \int_a^b \lambda P_i(\lambda)P_k(\lambda) \, d\rho(\lambda),$$

whence it follows immediately that the polynomials $P_k(\lambda)$ form an
orthonormal system with respect to $\rho(\lambda)$, and that the elements
of matrix (64) are expressible in terms of them in accordance with (60).
If the function $\mathcal{E}_\lambda$ has discontinuities, an eigenvalue with rank unity
corresponds to each such discontinuity, as we saw in the previous
section. It is clear from this that $\mathcal{E}_\lambda$ cannot reduce to a finite number of
jumps, and the same can be said regarding $\rho(\lambda)$. Conversely, any
Jacobian matrix can be constructed from (60), by choosing as $\rho(\lambda)$ any
non-decreasing (not necessarily continuous) function, provided only that
it does not reduce to a finite number of jumps.


168. Differential solutions. Let us take a self-conjugate operator
(matrix) with purely continuous spectrum. As we have seen, a sequence
of mutually orthogonal normalized elements $y^{(s)}$ $(s = 1, 2, \ldots)$ can be
formed, such that the $\mathcal{E}_\lambda y^{(s)}$ form mutually orthogonal subspaces
$H_s$, the orthogonal sum of which is the whole of $H$. The number of
elements $y^{(s)}$ may be either finite or infinite. Let $p_k^{(s)}(\lambda)$ $(k = 1, 2, \ldots)$
be the components of the element $\mathcal{E}_\lambda y^{(s)}$. The functions $p_k^{(s)}(\lambda)$ are of
bounded variation, and, given any $s$ in any interval $\Delta$ contained in
$[a, b]$, they satisfy the equations

$$\sum_{k=1}^{\infty} a_{ik}\,\Delta p_k^{(s)}(\lambda) = \int_\Delta \lambda \, dp_i^{(s)}(\lambda) \qquad (i = 1, 2, \ldots). \tag{72}$$

The following orthogonal properties of the solutions $p_k^{(s)}(\lambda)$ can be
proved [151]:

$$\sum_{k=1}^{\infty} \Delta_1 p_k^{(s)}(\lambda) \cdot \overline{\Delta_2 p_k^{(t)}(\lambda)} = 0 \tag{73}$$

($s \ne t$; intervals $\Delta_1$ and $\Delta_2$ arbitrary),

$$\sum_{k=1}^{\infty} \Delta_1 p_k^{(s)}(\lambda) \cdot \overline{\Delta_2 p_k^{(s)}(\lambda)} = 0 \tag{73$_1$}$$

($\Delta_1$ and $\Delta_2$ have no common interior points).

The fundamental formulae of [149] yield the following equations,
containing the differential solutions:

$$\sum_s \int_a^b \frac{dp_i^{(s)}(\lambda) \, d\overline{p_k^{(s)}(\lambda)}}{d\rho_s(\lambda)} = \begin{cases} 0 & (k \ne i), \\ 1 & (k = i), \end{cases} \tag{74}$$

$$a_{ik} = \sum_s \int_a^b \lambda \, \frac{dp_i^{(s)}(\lambda) \, d\overline{p_k^{(s)}(\lambda)}}{d\rho_s(\lambda)}, \tag{75}$$

$$l_{ik}(\lambda) = \sum_s \int_a^\lambda \frac{dp_i^{(s)}(\mu) \, d\overline{p_k^{(s)}(\mu)}}{d\rho_s(\mu)}, \tag{76}$$

where $l_{ik}(\lambda)$ are the elements of the spectral matrix. If solutions of
system (72), satisfying the orthogonality conditions (73) and (73$_1$),
have been obtained in some manner, and (76) has been proved for
any $\lambda$, we can be sure that the system of solutions obtained is complete,
and that the remaining formulae hold. Let $y$ and $z$ be elements of $l_2$ and

$$y^{(s)}(\lambda) = \sum_{k=1}^{\infty} \eta_k\,\overline{p_k^{(s)}(\lambda)}; \qquad z^{(s)}(\lambda) = \sum_{k=1}^{\infty} \zeta_k\,\overline{p_k^{(s)}(\lambda)}.$$

Formula (272) of [149] can be written as

$$(Ay, z) = \sum_s \int_a^b \lambda \, \frac{dy^{(s)}(\lambda) \, d\overline{z^{(s)}(\lambda)}}{d\rho_s(\lambda)}. \tag{76$_1$}$$

It is an immediate consequence of (75). The remaining formulae of
[149] may be written similarly. It is easily shown that, if a differential
solution $v_k(\lambda)$ $(k = 1, 2, \ldots)$ is orthogonal to all the above-mentioned
differential solutions that form a complete system, all the $v_k(\lambda)$ are
constant. The case of constant $v_k(\lambda)$ yields a trivial solution of system
(72), since now $\Delta v_k(\lambda) = 0$ and $dv_k(\lambda) = 0$. The differential solutions
$p_k^{(s)}(\lambda)$ obtained after separating out the point spectrum, are obviously
orthogonal to all the eigenelements of the operator $A$. Let us return to
system (72), and let $p_k(\lambda)$ $(k = 1, 2, 3, \ldots)$ be some differential
solution of this system:

$$\sum_{k=1}^{\infty} a_{ik}\,\Delta p_k(\lambda) = \int_\Delta \lambda \, dp_i(\lambda) \qquad (i = 1, 2, \ldots).$$




Suppose that all the functions $p_k(\lambda)$ have continuous derivatives
$p_k'(\lambda)$. The Stieltjes integrals written now reduce to ordinary integrals
of continuous functions, and, if we apply the mean value theorem to
them, and write $\lambda'$ and $\lambda''$ for the ends of the interval $\Delta$, we obtain

$$\sum_{k=1}^{\infty} a_{ik}\,[p_k(\lambda'') - p_k(\lambda')] = \lambda_i\, p_i'(\lambda_i)\,(\lambda'' - \lambda'), \qquad (77)$$

where $\lambda' < \lambda_i < \lambda''$. On applying Lagrange's formula to the left-hand
side, dividing both sides by $(\lambda'' - \lambda')$, and letting $\lambda'$ and $\lambda''$ tend to
the common limit $\lambda$, we obtain

$$\sum_{k=1}^{\infty} a_{ik}\, p_k'(\lambda) = \lambda\, p_i'(\lambda) \qquad (i = 1, 2, \ldots). \qquad (78)$$

It is assumed here that we can pass to the limit term by term in the
infinite sum on the left-hand side of (77). This will certainly be the
case if the sum is finite, i.e. if the matrix $a_{ik}$ has only a finite number
of non-zero elements in each row, and hence in each column. It will be
seen from (78) that, given these conditions, in the case of a continuous
spectrum, the $p_k'(\lambda)$ (k = 1, 2, ...) satisfy for any $\lambda$ the same equations
as we had for the eigenelements; but the $p_k'(\lambda)$ do not belong to $l_2$, i.e.
the sum formed from $|p_k'(\lambda)|^2$ is equal to $+\infty$, since no eigenelements
exist in the case of a purely continuous spectrum. In the case of a
mixed spectrum, the differential solutions can be added to the ordinary
solutions of $l_2$ that yield the eigenelements.


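169. Examples. 1. Suppose that, in the interval [−1, +1]:

$$\rho(\lambda) = \frac{2}{\pi}\int_{-1}^{\lambda} \sqrt{1 - \lambda^2}\; d\lambda.$$

Condition (59) for the polynomials $P_n(\lambda)$ becomes

$$\frac{2}{\pi}\int_{-1}^{+1} \sqrt{1 - \lambda^2}\; P_i(\lambda)\, P_k(\lambda)\, d\lambda = \begin{cases} 0 & \text{for } i \ne k, \\ 1 & \text{for } i = k. \end{cases} \qquad (79)$$

It may easily be shown that these conditions are satisfied by the polynomials

$$P_n(\lambda) = \frac{\sin (n+1)\theta}{\sin \theta}, \quad \text{where } \cos\theta = \lambda.$$

We can easily show, by using de Moivre's formula, that the fraction here is
actually a polynomial of degree n in cos θ. Condition (79) may be verified by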




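direct substitution, if we change the variable by putting λ = cos θ. The
numbers $a_k$ and $b_k$ appearing in matrix (64) are defined by

$$a_k = \frac{2}{\pi}\int_{-1}^{+1} \lambda\sqrt{1-\lambda^2}\; P_k^2(\lambda)\, d\lambda; \qquad b_k = \frac{2}{\pi}\int_{-1}^{+1} \lambda\sqrt{1-\lambda^2}\; P_k(\lambda)\, P_{k+1}(\lambda)\, d\lambda.$$

On bringing in the variable θ and evaluating the resulting integrals, we find
that $a_k = 0$ and $b_k = 1/2$ for any k, i.e. the elements of the corresponding
matrix are given by $a_{p,p+1} = a_{p+1,p} = 1/2$, whilst the remaining $a_{ik} = 0$. This
matrix has a simple purely continuous spectrum. The unique differential
solution of the system is given, in accordance with (50), by

$$p_n(\lambda) = \frac{2}{\pi}\int_{-1}^{\lambda} \sqrt{1-\lambda^2}\; P_{n-1}(\lambda)\, d\lambda = -\frac{2}{\pi}\int_{\pi}^{\theta} \sin\theta\, \sin n\theta\; d\theta,$$

whence $p_n'(\lambda) = (2/\pi)\sqrt{1-\lambda^2}\,P_{n-1}(\lambda) = (2/\pi)\sin n\theta$. On throwing away the
factor 2/π, system (34):

$$\tfrac12 x_2 = \lambda x_1; \qquad \tfrac12 x_1 + \tfrac12 x_3 = \lambda x_2; \qquad \tfrac12 x_2 + \tfrac12 x_4 = \lambda x_3; \qquad \ldots$$

is seen to have the solution $x_n = \sin (n \arccos \lambda)$, where −1 < λ < +1.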
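
The claim that sin(n arccos λ) solves the three-term system with off-diagonal elements 1/2 can be checked numerically. The following is only a minimal sketch: the names `u` and `residuals`, the sample values of λ and the cut-off `n_max` are illustrative choices, not taken from the text.

```python
import math

def u(n, lam):
    """Candidate solution sin(n * arccos(lam)); note that u(0, lam) = 0."""
    return math.sin(n * math.acos(lam))

def residuals(lam, n_max=8):
    """Residuals of (1/2)u_{n-1} + (1/2)u_{n+1} - lam*u_n for n = 1..n_max.

    For n = 1 the term u_0 vanishes, which reproduces the first equation
    (1/2)u_2 = lam*u_1 of the system.
    """
    return [0.5 * u(n - 1, lam) + 0.5 * u(n + 1, lam) - lam * u(n, lam)
            for n in range(1, n_max + 1)]

# Largest residual over a few sample points of the open interval (-1, 1).
max_residual = max(abs(r)
                   for lam in (-0.9, -0.3, 0.2, 0.75)
                   for r in residuals(lam))
print(max_residual)
```

The residuals vanish up to rounding because sin (n−1)θ + sin (n+1)θ = 2 sin nθ cos θ. The sum of the squares of these components does not converge, in agreement with the remark that differential solutions lie outside the sequence space.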
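2. Let us consider the closed, orthonormal system $(1/\sqrt{2\pi})\,e^{ik\lambda}$ (k = 0, ±1,
±2, ...) in the interval [−π, +π]. Taking ρ(λ) = λ, (51) gives us the following
elements for the corresponding matrix:

$$a_{pq} = \frac{1}{2\pi}\int_{-\pi}^{+\pi} \lambda\, e^{i(p-q)\lambda}\, d\lambda = \frac{(-1)^{p-q}}{i(p-q)} \quad \text{for } p \ne q \quad \text{and} \quad a_{pp} = 0,$$

where the subscripts p and q run from (−∞) to (+∞). In accordance with
(50):

$$p_k(\lambda) = \frac{1}{\sqrt{2\pi}}\int_{-\pi}^{\lambda} e^{ik\lambda}\, d\lambda; \qquad p_k'(\lambda) = \frac{1}{\sqrt{2\pi}}\, e^{ik\lambda},$$

and (78) leads to the equations

$${\sum_{k=-\infty}^{+\infty}}' \frac{(-1)^{p-k}}{i(p-k)}\, \frac{e^{ik\lambda}}{\sqrt{2\pi}} = \lambda\, \frac{e^{ip\lambda}}{\sqrt{2\pi}},$$

where the prime on the summation sign indicates that we have to exclude
k = p.
The last equation can be rewritten as

$${\sum_{k=-\infty}^{+\infty}}' \frac{(-1)^{p-k}}{i(p-k)}\, e^{i(k-p)\lambda} = \lambda,$$

or as

$$\sum_{j=-\infty}^{+\infty} \frac{(-1)^{j}}{ij}\, e^{-ij\lambda} = \lambda,$$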




the value j = 0 being excluded. This equation is an ordinary Fourier expansion
of λ, the series being divergent at the ends of the interval, i.e. for λ = ±π.
This last is due to the fact that the expansion is written in the complex form.
Let us now apply (56), putting f(λ) = −π − λ for λ < 0 and f(λ) = π − λ
for λ > 0. We obtain the matrix

$$b_{pq} = \frac{1}{2\pi}\int_{-\pi}^{0} (-\pi - \lambda)\, e^{i(p-q)\lambda}\, d\lambda + \frac{1}{2\pi}\int_{0}^{\pi} (\pi - \lambda)\, e^{i(p-q)\lambda}\, d\lambda,$$

or

$$b_{pq} = \frac{i}{p-q} \quad \text{for } p \ne q \quad \text{and} \quad b_{pp} = 0. \qquad (80)$$
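
The closed form of the matrix elements can be cross-checked by direct numerical integration of f(λ) against $e^{i(p-q)\lambda}$ with the factor 1/2π. The sketch below assumes that reading of the defining integral; the function names `f` and `b` and the midpoint rule with `cells` subintervals are illustrative choices, not part of the text.

```python
import cmath
import math

def f(lam):
    """Multiplier from the text: -pi - lam for lam < 0, pi - lam for lam > 0."""
    return -math.pi - lam if lam < 0 else math.pi - lam

def b(p, q, cells=20000):
    """Midpoint-rule value of (1/2pi) * integral over [-pi, pi] of f(lam)*exp(i(p-q)lam).

    An even number of cells places the jump of f at lam = 0 on a cell edge,
    so every midpoint lies strictly inside one smooth piece of f.
    """
    h = 2 * math.pi / cells
    total = 0j
    for k in range(cells):
        lam = -math.pi + (k + 0.5) * h
        total += f(lam) * cmath.exp(1j * (p - q) * lam)
    return total * h / (2 * math.pi)

print(b(3, 1))   # compare with i/(3 - 1)
print(b(1, 4))   # compare with i/(1 - 4)
print(b(2, 2))   # diagonal element, compare with 0
```

Since f is real and odd, only the sine part of the exponential contributes, which is why the computed elements are purely imaginary.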


Since $|f(\lambda)| \le \pi$ in the interval [−π, +π], we have the following bound for
the quadratic form [155]:

$$\left| {\sum_{p,q=-s}^{+t}}' \frac{i\,\xi_p\,\overline{\xi_q}}{p-q} \right| \le \pi \sum_{k=-s}^{+t} |\xi_k|^2. \qquad (81)$$

The factor i on the left of this inequality can obviously be thrown away.
If we put $\xi_p = 0$ for p ≤ 0 and the remaining $\xi_p$ are real, we get the following
inequality, given by Hilbert:


n 
<a> Ej. (82) 








It may easily be shown that the matrix with elements $a_{pq} = 1/|p - q|$ when
p ≠ q, and $a_{pp} = 0$, no longer corresponds to a bounded operator. For, if we
put $\xi_k = 1/\sqrt{n}$ for 1 ≤ k ≤ n and $\xi_k = 0$ for k > n, the norm of the element
$(\xi_1, \xi_2, \ldots)$ will be equal to unity, whilst the corresponding quadratic form
becomes

$${\sum_{p,q=1}^{n}}' \frac{1}{n}\,\frac{1}{|p-q|} = \frac{2}{n}\left[\frac{n-1}{1} + \frac{n-2}{2} + \cdots + \frac{1}{n-1}\right] = 2\left(1 + \frac12 + \cdots + \frac{1}{n-1} - \frac{n-1}{n}\right),$$

and this last expression increases indefinitely as n increases, since the sum
1 + 1/2 + ... + 1/(n − 1) increases indefinitely, whilst the fraction (n − 1)/n
tends to unity. In case (80) we do not have absolute convergence of the infinite
double series, but we can say that, for any two elements of $l_2$, the limit exists:

$$\lim_{m\to\infty} \sum_{p=1}^{m}\sum_{q=1}^{m} \frac{\xi_p\,\eta_q}{p-q} = \sum_{q=1}^{\infty}\left[\sum_{p=1}^{\infty} \frac{\xi_p}{p-q}\right]\eta_q,$$

the terms on the left for which p = q, and the value p = q in the inner summa-
tion over p, being excluded.
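
The growth of the quadratic form for the test element with n equal components 1/√n can be tabulated directly. This is a sketch under the assumption that the unbounded matrix has elements 1/|p − q| off the diagonal; the function names `quadratic_form` and `closed_form` are hypothetical.

```python
def quadratic_form(n):
    """Sum over p != q of xi_p * xi_q / |p - q| with xi_k = 1/sqrt(n), k = 1..n."""
    total = 0.0
    for p in range(1, n + 1):
        for q in range(1, n + 1):
            if p != q:
                total += (1.0 / n) / abs(p - q)
    return total

def closed_form(n):
    """2 * (1 + 1/2 + ... + 1/(n-1) - (n-1)/n), the reduced expression."""
    harmonic = sum(1.0 / d for d in range(1, n))
    return 2.0 * (harmonic - (n - 1) / n)

for n in (10, 100, 400):
    print(n, quadratic_form(n), closed_form(n))
```

The two columns agree, and the values grow like twice the logarithm of n although the element always has unit norm, so no bound of the form (81) can hold for this matrix.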




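3. Let us now take, instead of $e^{ik\lambda}$, the real closed orthonormal system

$$\varphi_k(\lambda) = \frac{1}{\sqrt{2\pi}}\,(\sin k\lambda + \cos k\lambda) \qquad (k = 0, \pm 1, \pm 2, \ldots).$$

On applying (56), we arrive at the matrix

$$b_{pq} = \frac{1}{2\pi}\int_{-\pi}^{+\pi} f(\lambda)\,(\sin p\lambda + \cos p\lambda)(\sin q\lambda + \cos q\lambda)\, d\lambda,$$

or

$$b_{pq} = \frac{1}{2\pi}\int_{-\pi}^{+\pi} f(\lambda) \cos (p-q)\lambda\; d\lambda + \frac{1}{2\pi}\int_{-\pi}^{+\pi} f(\lambda) \sin (p+q)\lambda\; d\lambda.$$

On putting, as before, f(λ) = −π − λ for λ < 0 and f(λ) = π − λ for
λ > 0, we get the following matrix:

$$b_{pq} = \frac{1}{p+q} \quad \text{for } p + q \ne 0 \quad \text{and} \quad b_{pq} = 0 \quad \text{for } p + q = 0.$$

We arrive at the inequality, analogous to (81):

$$\left| {\sum_{p,q=-s}^{+t}}' \frac{\xi_p\,\overline{\xi_q}}{p+q} \right| \le \pi \sum_{k=-s}^{+t} |\xi_k|^2, \qquad (83)$$

where the prime indicates that we have to exclude the terms for which p + q = 0.
The inequality corresponding to (82) is

$$\left| \sum_{p,q=1}^{n} \frac{\xi_p\,\xi_q}{p+q} \right| \le \pi \sum_{k=1}^{n} \xi_k^2. \qquad (84)$$

All the numbers can be assumed positive, and this double series is absolutely
convergent for any element of $l_2$, so that we can write:

$$\sum_{p,q=1}^{\infty} \frac{\xi_p\,\xi_q}{p+q} \le \pi \sum_{k=1}^{\infty} \xi_k^2. \qquad (85)$$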
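
Hilbert's inequality for the kernel 1/(p + q), with constant π, can be illustrated on a few finite real sequences. The sketch below is not a proof; the sequences in `samples` and the function names are arbitrary illustrative choices.

```python
import math

def hilbert_form(xi):
    """Double sum of xi_p * xi_q / (p + q) over p, q = 1..n for a real sequence."""
    n = len(xi)
    return sum(xi[p - 1] * xi[q - 1] / (p + q)
               for p in range(1, n + 1)
               for q in range(1, n + 1))

def norm_squared(xi):
    """Sum of the squares of the components."""
    return sum(t * t for t in xi)

samples = {
    "1/k": [1.0 / k for k in range(1, 201)],
    "1/sqrt(k)": [1.0 / math.sqrt(k) for k in range(1, 201)],
    "alternating": [(-1) ** k / k for k in range(1, 201)],
}

for name, xi in samples.items():
    lhs = abs(hilbert_form(xi))
    rhs = math.pi * norm_squared(xi)
    print(name, lhs, rhs, lhs <= rhs)
```

The slowly decaying sequence 1/√k brings the two sides closest together, which is consistent with the constant π being the best possible one for this kernel.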


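170. Weak convergence in $l_2$. Let $x^{(n)}(\xi_1^{(n)}, \xi_2^{(n)}, \ldots)$ (n = 1, 2, ...)
be a sequence of elements strongly convergent to the element $x(\xi_1, \xi_2,
\ldots)$, i.e. $x^{(n)} \Rightarrow x$. This can be written as

$$\sum_{k=1}^{\infty} |\xi_k - \xi_k^{(n)}|^2 \to 0 \quad \text{as } n \to \infty. \qquad (86)$$

It follows from this that

$$\|x^{(n)}\| \to \|x\| \qquad (87)$$

and the $\|x^{(n)}\|$ are bounded by the number l (independent of n).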


It follows at once from (86) that

$$\xi_k^{(n)} \to \xi_k \quad \text{as } n \to \infty \qquad (k = 1, 2, \ldots), \qquad (88)$$

but (86) does not follow from (88).
Let us show that condition (88), together with the condition that
the norms of elements $x^{(n)}$ be bounded:

$$\sum_{k=1}^{\infty} |\xi_k^{(n)}|^2 \le l^2, \qquad (89)$$

is equivalent to the weak convergence of $x^{(n)}$ to x.

If we have weak convergence, the $\|x^{(n)}\|$ must be bounded, i.e.
(89) must hold for some choice of l and, in addition, we must have
$(x^{(n)}, \varphi_k) \to (x, \varphi_k)$, where $\varphi_k$ are the base vectors mentioned pre-
viously, whilst this leads to (88).

Conversely, if conditions (88) and (89) are fulfilled, the weak con-
vergence of $x^{(n)}$ to x follows at once from what was said in [132].

We can therefore state the theorem:

THEOREM. Conditions (88) and (89) are necessary and sufficient for
the existence of a weak limit of the sequence of elements $x^{(n)}(\xi_1^{(n)}, \xi_2^{(n)}, \ldots)$,
and if they are fulfilled, the limiting element has the components $(\xi_1, \xi_2, \ldots)$.

171. Completely continuous operators in $l_2$. We have already
obtained [108] a sufficient condition for an infinite matrix to define
a completely continuous operator in $l_2$, viz. if the double series

$$\sum_{n,m=1}^{\infty} |a_{nm}|^2 \qquad (90)$$

is convergent, the matrix $a_{nm}$ defines a completely continuous operator
in $l_2$.

The convergence of series (90) is merely sufficient for the operator
defined by the matrix $a_{nm}$ to be completely continuous. It can be
shown that the necessary and sufficient condition is that the passage
to the limit indicated in (8) should hold uniformly for x and y, the
norms of which do not exceed unity. The equation $(E - \mu A)x = y$
has the form in $l_2$:

$$\xi_n - \mu \sum_{m=1}^{\infty} a_{nm}\,\xi_m = \eta_n, \qquad (91)$$




where $(\eta_1, \eta_2, \ldots)$ is the given and $(\xi_1, \xi_2, \ldots)$ is the required element
of $l_2$. If A is a completely continuous operator, everything said in [135]
holds for system (91). Let A be a self-conjugate completely continuous
operator and $y_k$ (k = 1, 2, ...) be a complete orthonormal system of
its eigenelements. Let U be the unitary operator in $l_2$ defined by the con-
ditions $y_k = U\varphi_k$, where $\varphi_k$ are the previous base vectors in $l_2$. If we
take the $y_k$ as the new base vectors in $l_2$, the operator A now becomes
$B = UAU^{-1}$. Its components are given by $\{B\}_{nm} = (By_m, y_n) =
\lambda_m (y_m, y_n)$, since $By_m = \lambda_m y_m$.
Consequently,

$$\{B\}_{nm} = \begin{cases} \lambda_m & \text{for } m = n, \\ 0 & \text{for } m \ne n, \end{cases} \qquad (92)$$

i.e. a diagonal matrix corresponds to the operator B in the base vectors
$y_k$, the diagonal being composed of the eigenvalues of the operator.
This remains true for any linear self-conjugate or unitary operator
with purely point spectrum [146].

The operator A is the unitary equivalent of B, in fact $A = U^{-1}BU$.
In view of what has been said, we can assert that the matrix correspond-
ing to a completely continuous self-conjugate operator in $l_2$ is the
unitary equivalent of a diagonal matrix in which the diagonal elements
$\lambda_m$ satisfy the conditions given in [136].


172. Integral operators in $L_2$. We have already considered integral
operators in $L_p$. Let us now consider them in more detail in $L_2$:

$$\varphi(x) = \int_a^b K(x, y)\, f(y)\, dy, \qquad (93)$$

where K(x, y) is a measurable function in the interval $\Delta_0$ (a ≤ x ≤ b;
a ≤ y ≤ b), and hence is measurable with respect to y for almost all x
of [a, b], and vice versa. Suppose further that it belongs to $L_2$ as a
function of y for almost all x, and vice versa, i.e.

$$K_1^2(x) = \int_a^b |K(x, y)|^2\, dy < +\infty, \qquad (94)$$

$$K_2^2(y) = \int_a^b |K(x, y)|^2\, dx < +\infty, \qquad (95)$$

where $K_1(x)$ and $K_2(y)$ are measurable non-negative functions [67, 68].
It follows from (94) that, given any f(y) ∈ $L_2$, integral (93) exists for
almost all x, and φ(x) is a measurable function [67, 68]. The necessary
and sufficient condition for (93) to be a linear bounded transformation




when (94) holds is that there exists a positive number N such that, for any
f(x) of $L_2$,

$$\int_a^b |\varphi(x)|^2\, dx = \int_a^b \left| \int_a^b K(x, y)\, f(y)\, dy \right|^2 dx \le N^2 \int_a^b |f(y)|^2\, dy. \qquad (96)$$

There is a simple sufficient condition for the operator corresponding
to the kernel K(x, y) to be bounded, precisely analogous to the con-
dition for the matrix to be bounded: a positive number l must exist,
such that

$$\int_a^b |K(x, y)|\, dy \le l \quad \text{and} \quad \int_a^b |K(x, y)|\, dx \le l. \qquad (97)$$

It is sufficient to show that the corresponding bilinear functional
must be bounded. On replacing all the functions by their moduli in
the iterated integral expressing the functional, the iterated integral can be
replaced by a double integral:

$$\int_a^b\!\!\int_a^b |K(x, y)|\,|f_1(y)|\,|f_2(x)|\, dx\, dy \le \frac12 \int_a^b\!\!\int_a^b |K(x, y)|\,\bigl[\,|f_1(y)|^2 + |f_2(x)|^2\,\bigr]\, dx\, dy =$$

$$= \frac12 \int_a^b \left[\int_a^b |K(x, y)|\, dx\right] |f_1(y)|^2\, dy + \frac12 \int_a^b \left[\int_a^b |K(x, y)|\, dy\right] |f_2(x)|^2\, dx \le$$

$$\le \frac{l}{2}\left[\int_a^b |f_1(y)|^2\, dy + \int_a^b |f_2(x)|^2\, dx\right].$$

But this last expression is equal to l, if $\|f_1\| = \|f_2\| = 1$. By using an
exactly similar method of proof, a more general sufficient condition
can be given for operator (93) to be bounded, viz. there exist a
positive number l and a positive function ω(x), continuous in [a, b],
such that

$$\int_a^b |K(x, y)|\,\omega(y)\, dy \le l\,\omega(x); \qquad \int_a^b |K(x, y)|\,\omega(x)\, dx \le l\,\omega(y). \qquad (98)$$
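
The matrix analogue of the sufficient condition (97), to which the text compares it, is easy to test numerically: if every absolute row sum and column sum is at most l, the bilinear form is bounded by l times the product of the norms. The sketch below uses an arbitrary sample matrix and arbitrary sample vectors; all names in it are hypothetical.

```python
import math

def schur_constant(matrix):
    """Smallest l such that all absolute row sums and column sums are <= l."""
    row_max = max(sum(abs(v) for v in row) for row in matrix)
    col_max = max(sum(abs(row[j]) for row in matrix)
                  for j in range(len(matrix[0])))
    return max(row_max, col_max)

def bilinear_form(matrix, x, y):
    """Sum over i, j of matrix[i][j] * x[j] * y[i] for real vectors x and y."""
    return sum(matrix[i][j] * x[j] * y[i]
               for i in range(len(matrix))
               for j in range(len(matrix[0])))

def norm(v):
    return math.sqrt(sum(t * t for t in v))

n = 40
kernel = [[1.0 / (1.0 + (i - j) ** 2) for j in range(n)] for i in range(n)]
x = [math.sin(j + 1.0) for j in range(n)]
y = [math.cos(2.0 * i) for i in range(n)]

l = schur_constant(kernel)
lhs = abs(bilinear_form(kernel, x, y))
rhs = l * norm(x) * norm(y)
print(l, lhs, rhs, lhs <= rhs)
```

The estimate follows the same route as the proof for kernels: the product of moduli is bounded by half the sum of squares, and the row and column sums are then taken out one at a time.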


173. The conjugate operator. In the case of a bounded operator,
no meaning may attach to the integral

$$\int_a^b K_1(x)\, t(x)\, dx, \qquad (99)$$




where the non-negative function $K_1(x)$ is given by (94), for certain t(x)
of $L_2$. The set of the t(x) of $L_2$ for which it has a meaning is obviously
a lineal l in $L_2$.

THEOREM 1. The lineal l is everywhere dense in $L_2$.

We have to show that the closure of l gives the whole of $L_2$. If this
were not the case, there would exist a non-zero element π(x) of $L_2$
orthogonal to the subspace formed by the closure of l, and hence to all
the t(x) of l. It is therefore sufficient for us to show that, if π(x) =
$\pi_1(x) + i\pi_2(x)$ is orthogonal to all the t(x) of l, i.e.

$$\int_a^b t(x)\,\overline{\pi(x)}\, dx = 0, \qquad (100)$$

π(x) must be equivalent to zero. We choose t(x) of l in a special way.
Let m be any given finite positive number, $e_m$ the set of the x for which
$K_1(x) \le m$ and $e_m'$ any part of $e_m$ of measure ≤ m. We define t(x) so
that t(x) = 1 if x ∈ $e_m'$, and t(x) = 0 for other x. This t(x) belongs
to l, and we obtain on applying (100):

$$\int_{e_m'} [\pi_1(x) - i\pi_2(x)]\, dx = 0. \qquad (101)$$

This equation holds for any part $e_m'$, so that, e.g.,

$$\int_{e_m} \pi_1^{+}(x)\, dx = 0, \qquad (102)$$

where $\pi_1^{+}(x)$ is the positive part of $\pi_1(x)$, i.e. $\pi_1^{+}(x)$ is equivalent to zero
on $e_m$. If m increases indefinitely and use is made of (94), $\pi_1^{+}(x)$ is
seen to be equivalent to zero in [a, b]. We can similarly assert the same
thing for $\pi_1^{-}(x)$, $\pi_2^{+}(x)$ and $\pi_2^{-}(x)$, and the theorem is proved. Notice
that we have only made use in the proof of the fact that $K_1(x)$ is any
given non-negative function, finite and measurable almost everywhere
in [a, b]. We shall in future write $l_1$ for the analogous lineal for the
product $K_2(x)\,t(x)$. It is also everywhere dense in $L_2$.
THEOREM 2. If (93) defines a bounded operator, the conjugate operator
is the integral operator with the kernel

$$K^*(x, y) = \overline{K(y, x)}. \qquad (103)$$

On writing A for operator (93) and using the definition of the con-
jugate operator (Ax, y) = (x, A*y), we can write

$$\int_a^b \left[\int_a^b K(x, y)\, t(y)\, dy\right] \overline{g(x)}\, dx = \int_a^b t(y)\,\overline{g^*(y)}\, dy, \qquad (104)$$

where g*(x) = A*g(x), and it is assumed that t(y) ∈ $l_1$.




On using the inequality

$$\int_a^b |K(x, y)\, g(x)|\, dx \le \left(\int_a^b |K(x, y)|^2\, dx\right)^{1/2} \cdot \left(\int_a^b |g(x)|^2\, dx\right)^{1/2} = K_2(y) \cdot \|g\|$$

and the fact that t(x) ∈ $l_1$, we can say that one of the iterated integrals
exists for |K(x, y) g(x) t(y)|, so that we can change the order of in-
tegration in the integral on the left-hand side of (104), i.e. we can
rewrite (104) as

$$\int_a^b t(y) \left[\int_a^b K(x, y)\,\overline{g(x)}\, dx - \overline{g^*(y)}\right] dy = 0.$$

A repetition of the proof of Theorem 1 shows us that the difference
in square brackets is equivalent to zero, and we can write, on passing
to the conjugates:

$$g^*(y) = \int_a^b \overline{K(x, y)}\; g(x)\, dx, \qquad (105)$$

whence the theorem follows. In view of this theorem, the equation
(Ax, y) = (x, A*y) can be written for integral operators as

$$\int_a^b \left[\int_a^b K(x, y)\, f(y)\, dy\right] \overline{g(x)}\, dx = \int_a^b \left[\int_a^b K(x, y)\,\overline{g(x)}\, dx\right] f(y)\, dy, \qquad (106)$$

which amounts to saying that the order of integration can be changed.
The corresponding double integral may not exist. If, in addition to the
conditions indicated, the kernel satisfies

$$K(x, y) = \overline{K(y, x)}, \qquad (107)$$

operator (105) is the same as (93), i.e. (93) is a self-conjugate operator.


174. Completely continuous operators. We saw above that, if a
function K(x, y) measurable in the square $\Delta_0$ satisfies

$$\int\!\!\int_{\Delta_0} |K(x, y)|^2\, dx\, dy < +\infty, \qquad (108)$$

operator (93) is completely continuous in $L_2$. What was said in [135]
holds for the integral equation


b 
§ K (x,y) f(y) dy = Af (£) + (2), (109) 


where (x) is the given and f(x) the required function of L, in [a, b]. 




If operator (93) is self-conjugate, i.e. K(x, y) is equivalent to $\overline{K(y, x)}$,
what was said in [136] is applicable to equation (109). Let us show
that integral (108) is equal to the square of the absolute norm of operator
(93). A preliminary lemma is needed.

LEMMA. If $\varphi_n(x)$ (n = 1, 2, ...) is a closed orthonormal system in the
interval [a, b], $\varphi_{m,n}(x, y) = \varphi_m(x)\varphi_n(y)$ is a closed orthonormal system
in the square $\Delta_0$.

By hypothesis,

$$\int_a^b \varphi_i(x)\,\overline{\varphi_k(x)}\, dx = \begin{cases} 0 & \text{for } i \ne k, \\ 1 & \text{for } i = k. \end{cases}$$

The functions $\varphi_{m,n}(x, y)$ obviously belong to $L_2(\Delta_0)$, and by Fubini's
theorem,

$$\int\!\!\int_{\Delta_0} \varphi_{m,n}(x, y)\,\overline{\varphi_{p,q}(x, y)}\, dx\, dy = \int_a^b \varphi_m(x)\,\overline{\varphi_p(x)}\, dx \int_a^b \varphi_n(y)\,\overline{\varphi_q(y)}\, dy,$$

whence it follows that system $\varphi_{m,n}(x, y)$ is orthonormal in $\Delta_0$. To prove
that this system is closed, we only need to show that, if f is orthogonal
to all the $\varphi_{m,n}$, it is equivalent to zero in $\Delta_0$ [58].

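Thus, let

$$\int\!\!\int_{\Delta_0} f(x, y)\,\overline{\varphi_m(x)\,\varphi_n(y)}\, dx\, dy = 0,$$

i.e.

$$\int_a^b \left\{\int_a^b f(x, y)\,\overline{\varphi_n(y)}\, dy\right\} \overline{\varphi_m(x)}\, dx = 0,$$

and, since system $\varphi_m(x)$ is closed, we obtain:

$$\int_a^b f(x, y)\,\overline{\varphi_n(y)}\, dy = 0 \quad \text{almost everywhere with respect to x in [a, b]},$$

and it can be asserted, by the same arguments, that f(x, y) = 0 almost
everywhere in $\Delta_0$, and the lemma is proved. Let $b_{mn}$ be the Fourier
coefficients of the kernel K(x, y), which belongs to $L_2$ in $\Delta_0$ by (108):

$$b_{mn} = \int\!\!\int_{\Delta_0} K(x, y)\,\overline{\varphi_m(x)}\,\varphi_n(y)\, dx\, dy.$$

We find the square of the absolute norm of the operator A corres-
ponding to this kernel [138]:

$$N^2(A) = \sum_{m,n} |(A\varphi_n, \varphi_m)|^2 = \sum_{m,n} \left| \int_a^b \left[\int_a^b K(x, y)\,\varphi_n(y)\, dy\right] \overline{\varphi_m(x)}\, dx \right|^2 =$$

$$= \sum_{m,n} \left| \int\!\!\int_{\Delta_0} K(x, y)\,\overline{\varphi_m(x)}\,\varphi_n(y)\, dx\, dy \right|^2 = \sum_{m,n} |b_{mn}|^2,$$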







but, by the closure equation, the last sum is equal to the integral of
(108). If condition (108) is fulfilled, and A is a self-conjugate operator,
we have

$$\int\!\!\int_{\Delta_0} |K(x, y)|^2\, dx\, dy = \sum_{k} \lambda_k^2, \qquad (110)$$

where $\lambda_k$ are the eigenvalues.
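
Relation (110) can be illustrated on a standard example that is not taken from the text: the real symmetric kernel min(x, y) on [0, 1], whose eigenvalues are known to be $1/((k - \tfrac12)^2\pi^2)$ with eigenfunctions $\sin (k - \tfrac12)\pi x$. Both sides should then equal 1/6. The function names and the midpoint quadrature below are illustrative assumptions.

```python
import math

def eigenvalue(k):
    """k-th eigenvalue of the kernel min(x, y) on [0, 1] (standard example)."""
    return 1.0 / (((k - 0.5) * math.pi) ** 2)

def sum_of_squared_eigenvalues(terms=10000):
    """Partial sum of the squares of the eigenvalues, the right-hand side of (110)."""
    return sum(eigenvalue(k) ** 2 for k in range(1, terms + 1))

def kernel_square_integral(cells=200):
    """Midpoint-rule value of the double integral of min(x, y)^2 over the unit square."""
    h = 1.0 / cells
    total = 0.0
    for i in range(cells):
        x = (i + 0.5) * h
        for j in range(cells):
            y = (j + 0.5) * h
            total += min(x, y) ** 2
    return total * h * h

print(sum_of_squared_eigenvalues())   # close to 1/6
print(kernel_square_integral())       # close to 1/6
```

The agreement of the two numbers reflects the closure equation for the product system used in the lemma.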
Everything said in [135] and [136] is preserved in the case of an
infinite interval, even for the multi-dimensional operator

$$\varphi(x) = \int_D K(x, y)\, f(y)\, dy, \qquad (111)$$

where $x(x_1, x_2, \ldots, x_n)$ and $y(y_1, y_2, \ldots, y_n)$ are points of n-dimensional
space $R_n$; $dy = dy_1\, dy_2 \ldots dy_n$ and D is a domain of $R_n$.


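175. Spectral functions. Let K denote the operator (93), which we assume
to be completely continuous and self-conjugate. We shall describe how to form
the spectral function $\mathcal{E}_\lambda$ for it, and the resolvent $R_\lambda = (K - \lambda E)^{-1}$.

We introduce instead of $\mathcal{E}_\lambda$ another function which is expressible as an integral
operator, viz. we put

$$\theta_\lambda = \mathcal{E}_\lambda \ \text{for } \lambda < 0; \qquad \theta_\lambda = \mathcal{E}_\lambda - E \ \text{for } \lambda > 0 \qquad \text{and} \qquad \theta_0 = 0. \qquad (112)$$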


Since the spectrum is purely point, we can say that $\mathcal{E}_\lambda$ is a projection operator
onto the subspace of the eigenfunctions $\varphi_k(x)$ for which $\lambda_k \le \lambda$. The projection
of the function f(x) onto the one-dimensional subspace of the eigenfunction
$\varphi_k(x)$ is the product $a_k\varphi_k(x)$, where $a_k$ are the Fourier coefficients of f(x):

$$a_k\,\varphi_k(x) = \int_a^b \varphi_k(x)\,\overline{\varphi_k(y)}\, f(y)\, dy.$$

Thus the projection operator onto the one-dimensional subspace is an in-
tegral operator with kernel $\varphi_k(x)\overline{\varphi_k(y)}$, and we can write

$$\theta_\lambda f(x) = \int_a^b \sum_{\lambda_k \le \lambda} \varphi_k(x)\,\overline{\varphi_k(y)}\, f(y)\, dy \quad \text{for } \lambda < 0,$$

where the summation is over the k for which $\lambda_k \le \lambda$, and the sum contains a
finite number of terms, by virtue of the property of the spectrum of a complete-
ly continuous operator. Thus, when λ < 0, $\theta_\lambda$ is an integral operator with the
kernel

$$\theta(x, y; \lambda) = \sum_{\lambda_k \le \lambda} \varphi_k(x)\,\overline{\varphi_k(y)} \quad \text{for } \lambda < 0. \qquad (113)$$

On using (112) and what has been said regarding $\mathcal{E}_\lambda$, we can say that $\theta_\lambda$,
with λ > 0, is an integral operator with kernel

$$\theta(x, y; \lambda) = -\sum_{\lambda_k > \lambda} \varphi_k(x)\,\overline{\varphi_k(y)} \quad \text{for } \lambda > 0, \qquad (114)$$

where the sum again contains a finite number of terms. When λ passes through
an eigenvalue, the kernel has a jump. It follows at once from (112) that $\theta_\lambda^2 = \theta_\lambda$




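for λ < 0 and $\theta_\lambda^2 = -\theta_\lambda$ for λ > 0, which can be written as

$$\int_a^b \theta(x, t; \lambda)\,\theta(t, y; \lambda)\, dt = \pm\,\theta(x, y; \lambda) \qquad \begin{cases} + \ \text{for } \lambda < 0, \\ - \ \text{for } \lambda > 0. \end{cases} \qquad (115)$$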
The function R, f(x) is obviously a solution of equation (109) with A = 1, 
on the assumption that J 4 0 and does not coincide with one of the åy. Another 
expression can be found for the resolvent, by starting from the equation 


+a 


R f(x) = f Sah (116) 
where the integration is actually carried out over a finite interval containing 
the spectrum of the operator. 

Let λ = 0 not be an eigenvalue. On replacing $\mathcal{E}_\lambda$ by $\theta_\lambda$, in accordance with
(112), and taking into account the extra jump of $\theta_\lambda$ on passing through
λ = 0, equal to (−E), we can write the above equation in the form


-8 


b 
f da [$ ete, y; 4) My) dy] 


Rifle) =- + fle) + tim | | —#*——— 4 


ey>+0 
£,->+0 


b 
7 dal f elz, y; 4) f(y) dy] 
jee] 


IZI (117) 


€: 


176. The spectral function (continued). The spectral function was introduced
for very general integral operators by Carleman in his work Sur les équations
intégrales singulières à noyau réel et symétrique (1921). An exposition of
this theory from the modern standpoint may be found in Stone's Linear
Transformations in Hilbert Space... (1932), and in N. I. Akhiezer's article
"Integral'nye operatory s yadrami Karlemana" (Integral operators with Carle-
man kernels) (1947). Integral operators of a more general type than the self-
conjugate bounded type (to be discussed in a moment) are investigated in
these works. The case of bounded self-conjugate operators was considered by
Hilbert, Hellinger and others.

The results for this last type will be given in broad outline. The kernel
K(x, y) of a bounded self-conjugate operator K can be approximated by kernels
$K_n(x, y)$ (n = 1, 2, ...), which correspond to completely continuous self-
conjugate operators. This gives us the possibility of showing that the operator
$\theta_\lambda$, given by (112), where $\mathcal{E}_\lambda$ is the spectral function of operator K, is an integral
operator for λ ≠ 0, and that the formulae hold:

$$\theta_\lambda f(x) = \int_a^b \theta(x, y; \lambda)\, f(y)\, dy, \qquad (118)$$

$$\int_a^b K(x, y)\, f(y)\, dy = \int_{-\infty}^{+\infty} \lambda\; d_\lambda \left\{ \int_a^b \theta(x, y; \lambda)\, f(y)\, dy \right\}, \qquad (119)$$




as also (115). The integral on the right-hand side of (119) is to be understood
as improper as regards λ = 0.

We shall assume that operator (93) does not have a point spectrum, and we
deduce for it the formulae of the general theory of [149]. Let $\omega_k(x)$ be the ele-
ments of $L_2$ corresponding to the $y_k$ of [149], where it can be assumed that the
$\omega_k(x)$ form an orthonormal system. By making use of θ(x, y; λ), a complete set
of differential solutions can be obtained:

$$\pi_k(x, \lambda) = \int_a^b \theta(x, y; \lambda)\,\omega_k(y)\, dy \quad \text{for } \lambda < 0,$$

$$\pi_k(x, \lambda) = \int_a^b \theta(x, y; \lambda)\,\omega_k(y)\, dy + \omega_k(x) \quad \text{for } \lambda > 0.$$

The operator $\theta_\lambda$ has a jump equal to (−E) at the point λ = 0.
We have

$$\rho_k(\lambda) = \int_a^b |\pi_k(x, \lambda)|^2\, dx, \qquad (120)$$
and, if p(x) and y(x) are any two elements of L,, on putting 
b 
IA) = J mle, A) plaj de, (121) 
b 
h(a) = f 1 (x, 4) p(x) dx, (122) 
we can write the EOR of [149] as 
= gal) Ahl) 123 
[ne ) oe) dr = zf Lae. (123) 
ad dgy(A) dhy(A) 
dzd Sg) Ahat) 124 
| ji K(x, y) p(y) v(@) dz dy = > f> do,(A) (124) 
aa 
_dgg(H) u) dhg(u) 125 
Jarom ve) da = X J ae (125) 


where m and M are the bounds of the operator, whilst the integral on the left- 
hand side of (124) must be understood as iterated in any order. If p(x) belongs 
to the lineal J on which integral (99) has a meaning, (99) exists as a double 
integral. The remaining formulae of [149] have the form 


M..———= 
dg (A) 
wa) = 3 “aac: dit, (2, 4), (126) 
$ m 
[Ken ewdy = z f a Jun day (£, 2), (127) 


a 


dgx (x) 5 
é = BEd u). 128 
a P(x) sf do, (#) my (©, H) (128) 




In the case of an infinite number of terms, the series must be convergent
to the quantities on the left.
The differential solutions π(x, λ) satisfy the equation

$$\int_a^b K(x, y)\,\pi(y, \lambda)\, dy = \int_m^{\lambda} \mu\; d_\mu \pi(x, \mu), \qquad (129)$$

where we assume π(x, m) = 0 as usual. The orthogonality properties of such
solutions are expressed by

$$\int_a^b \Delta_1 \pi_p(x, \lambda) \cdot \overline{\Delta_2 \pi_q(x, \lambda)}\, dx = 0 \qquad (p \ne q).$$

The above formulae can be obtained by starting from any complete ortho-
gonal system of differential solutions $\pi_k(x, \lambda)$. Examples of integral operators
in $L_2$ will be considered in subsequent sections.


177. Unitary transformations in $L_2$. Not every unitary trans-
formation φ(x) = Uf(x) in $L_2$ is expressible in the integral form.

The identity transformation φ(x) = f(x) can be quoted as an
example. But it can be written in the integral form if we pass to the
primitive function:

$$\int_0^x \varphi(y)\, dy = \int_{-\infty}^{+\infty} K(x, y)\, f(y)\, dy,$$

where (−∞, +∞) has been taken as the basic interval, K(x, y) = 1
for 0 ≤ y ≤ x and K(x, y) = 0 for y < 0 and y > x, if x > 0, and
similarly for x < 0. A similar result holds for any bounded operator A.
Let (−∞, +∞) be the basic interval. We fix some value of x and
consider the primitive for Af(x):

$$l(f) = \int_0^x [Af(t)]\, dt.$$

We have the distributive property for l(f), and, by Buniakowski's
inequality:

$$|l(f)| \le \left| \int_0^x |Af(t)|^2\, dt \right|^{1/2} \left| \int_0^x dt \right|^{1/2} \le \|A\|\,\sqrt{|x|}\;\|f\|.$$

It follows from this that l(f) can be regarded as a linear bounded
functional, depending on the parameter x, and, by the theorem of
[123], we have

$$\int_0^x Af(t)\, dt = \int_{-\infty}^{+\infty} K(x, y)\, f(y)\, dy,$$




where K(x, y) ∈ $L_2(-\infty, +\infty)$ with respect to y (for any x ∈ (−∞,
+∞)) and

$$\int_{-\infty}^{+\infty} |K(x, y)|^2\, dy \le \|A\|^2\,|x|.$$

We shall shortly prove a theorem that gives the general analytic
form of a unitary transformation with the aid of a passage to the
primitive function. This theorem was first proved by Bochner (Annals
of Mathem., Vol. 35, No. 1, 1934), who considered the interval 0 ≤ x <
+∞. The proof does not depend on the choice of interval. We shall
take (−∞, +∞) for definiteness and write $L_2$ for the class of functions
of $L_2$ in (−∞, +∞) (see F. Riesz and B. Sz.-Nagy, Leçons d'Analyse
Fonctionnelle, p. 316).

THEOREM. Let K(x, y) and L(x, y) belong to $L_2$ with respect to y for any
fixed x of (−∞, +∞), whilst the formulae hold for all a and b of
(−∞, +∞):

$$\left. \begin{aligned} \int_{-\infty}^{+\infty} K(a, y)\,\overline{K(b, y)}\, dy \\ \int_{-\infty}^{+\infty} L(a, y)\,\overline{L(b, y)}\, dy \end{aligned} \right\} = \begin{cases} \min (|a|, |b|) & \text{for } ab \ge 0, \\ 0 & \text{for } ab \le 0 \end{cases} \qquad (130)$$

and

$$\int_0^b K(a, y)\, dy = \int_0^a \overline{L(b, y)}\, dy. \qquad (131)$$

Now, the formulae

$$\int_0^a \varphi(y)\, dy = \int_{-\infty}^{+\infty} \overline{L(a, y)}\, f(y)\, dy, \qquad (132)$$

$$\int_0^a f(y)\, dy = \int_{-\infty}^{+\infty} \overline{K(a, y)}\,\varphi(y)\, dy \qquad (133)$$

define a unitary transformation φ(x) = Uf(x) and its inverse.
Conversely, if φ(x) = Uf(x) is a unitary transformation, there exist
functions K(x, y), L(x, y) with the above-mentioned properties, with the
aid of which U and $U^{-1}$ can be expressed by (132) and (133).
Let us start by proving the first half of the theorem.
We introduce the function $f_a(x)$:

$$f_a(x) = \begin{cases} 1 & \text{for } 0 \le x \le a, \\ 0 & \text{for } x < 0 \text{ and } x > a \end{cases} \qquad (a > 0);$$

$$f_a(x) = \begin{cases} -1 & \text{for } a \le x \le 0, \\ 0 & \text{for } x < a \text{ and } x > 0 \end{cases} \qquad (a < 0)$$




and $f_0(x) = 0$, and we define operators $U_0$ and $V_0$ as follows:

$$K(a, x) = U_0 f_a(x); \qquad L(a, x) = V_0 f_a(x). \qquad (134)$$

On forming all possible finite linear combinations of functions $f_a(x)$
for different a, we get a lineal l of piecewise constant functions, i.e.
functions that take constant values on a finite number of finite inter-
vals and are zero outside these intervals. The values of the functions
at the ends of the intervals have no importance here, since equivalent
functions are regarded as identical. We extend operators $U_0$ and $V_0$
on to the lineal l, taking as our basis the distributive properties of $U_0$
and $V_0$. It may easily be seen that this extension is unique. Let U
and V denote the distributive operators on l.

Formulae (130), (131) and (134) give us

$$(U_0 f_a, U_0 f_b) = (f_a, f_b); \qquad (V_0 f_a, V_0 f_b) = (f_a, f_b), \qquad (135)$$

$$(U_0 f_a, f_b) = (f_a, V_0 f_b). \qquad (136)$$

Since the operators are distributive, we can write the same formulae
for U and V on l:

$$(Uf, Ug) = (f, g); \qquad (Vf, Vg) = (f, g), \qquad (137)$$

$$(Uf, g) = (f, Vg), \qquad (138)$$

where f and g ∈ l. It follows from (137) that U and V do not change
the norms of elements on l, and, since the lineal l is dense in $L_2$ [60],
we get a unique extension of U and V (in continuity) to the whole
of $L_2$. In view of the continuity of the scalar product, (137) and (138)
are preserved in $L_2$, and operators U and V do not change the norms
and scalar product in $L_2$. It follows from (138) that V = U*. On re-
placing f by Vf in (138) and using (137), we get UU* = E, and simi-
larly, on replacing g by Ug, we get U*U = E. Hence it follows that U
is a unitary transformation and V is its inverse [137]. It remains to
obtain (132) and (133). The first is obtained from (138) by putting
g(x) = $f_a(x)$, and the second by putting f(x) = $f_a(x)$ and passing to
the conjugates.

Let us turn to the proof of the second part of the theorem. Given
the unitary operator U, and $V = U^{-1} = U^*$, we form the functions

$$K(a, x) = U f_a(x); \qquad L(a, x) = U^{-1} f_a(x). \qquad (139)$$

On introducing the notation φ(x) = Uf(x) as above, and using the
fact that U is unitary, we get

$$(\varphi, f_a) = (Uf, f_a) = (f, U^{-1} f_a); \qquad (f, f_a) = (U^{-1}\varphi, f_a) = (\varphi, U f_a),$$




which, by (139), leads to (132) and (133). Formulae (137) and (138)
hold for the U and V given above. On putting f(x) = $f_a(x)$ and g(x) =
$f_b(x)$ in them, we get (130) and (131). The proof is complete.


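178. Fourier transformations. Watson considered the interval
0 ≤ x < +∞ and a kernel of the form

$$K(a, x) = \frac{\chi(ax)}{x}; \qquad L(a, x) = \frac{\overline{\chi(ax)}}{x},$$

on the assumption that χ(0) = 0 and χ(x)/x ∈ $L_2(0, +\infty)$.
Conditions (130) take the form

$$\int_0^{+\infty} \frac{\chi(ax)\,\overline{\chi(bx)}}{x^2}\, dx = \min (a, b) \qquad (a > 0;\ b > 0),$$

whilst (131) is fulfilled automatically.

Let us turn to Fourier transformations for which the basic interval
is (−∞, +∞) and

$$K(a, x) = \frac{1}{\sqrt{2\pi}}\,\frac{e^{-iax} - 1}{-ix}; \qquad L(a, x) = \frac{1}{\sqrt{2\pi}}\,\frac{e^{iax} - 1}{ix}. \qquad (140)$$

The modulus of the numerators does not exceed 2, and both functions
belong to $L_2$. Condition (131) is easily verified. Let us verify (130).
These reduce to a single expression, as above:

$$I = \frac{1}{2\pi}\int_{-\infty}^{+\infty} \frac{(e^{-iax} - 1)(e^{ibx} - 1)}{x^2}\, dx = \begin{cases} \min (|a|, |b|) & \text{for } ab \ge 0, \\ 0 & \text{for } ab \le 0. \end{cases} \qquad (141)$$

By differentiating with respect to the parameter a, it is easily seen
that

$$\int_{-\infty}^{+\infty} \frac{\sin^2 ax}{x^2}\, dx = \pi\,|a| \qquad (a \text{ real}). \qquad (142)$$

The integral I is easily transformed to

$$I = \frac{1}{\pi}\int_{-\infty}^{+\infty} \frac{\sin^2 \frac{a}{2}x + \sin^2 \frac{b}{2}x - \sin^2 \frac{a-b}{2}x}{x^2}\, dx,$$

and application of (142) gives us

$$I = \frac12\,\bigl(|a| + |b| - |a - b|\bigr),$$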




whence (141) follows. We thus obtain a unitary Fourier transformation,
for which we employ the symbol T: F(x) = Tf(x), in the following
form:

$$\int_0^a F(x)\, dx = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty} \frac{e^{-iax} - 1}{-ix}\, f(x)\, dx,$$

$$\int_0^a f(x)\, dx = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty} \frac{e^{iax} - 1}{ix}\, F(x)\, dx. \qquad (143)$$

Another possible form of the Fourier transformation was employed
earlier [II; 160].

Suppose first that f(x) vanishes outside some finite interval [−n,
+n]. By hypothesis, f(x) ∈ $L_2$ in [−n, +n], so that it ∈ $L_1$ in [−n,
+n]; thus, given any real y, the integral exists:

$$F_1(y) = \frac{1}{\sqrt{2\pi}}\int_{-n}^{+n} e^{-ixy}\, f(x)\, dx. \qquad (144)$$

It may easily be shown that $F_1(y)$ is continuous and has derivatives.
We have $|e^{-ixy} f(x)| = |f(x)|$, and we can integrate with respect to y
over a finite interval under the integral sign:

$$\int_0^a F_1(y)\, dy = \frac{1}{\sqrt{2\pi}}\int_{-n}^{+n} \frac{e^{-iax} - 1}{-ix}\, f(x)\, dx.$$

On comparing this formula with (143) and noting that a is arbitrary,
and that f(x) vanishes by hypothesis outside [−n, +n], we can say
that $F_1(y)$ is equivalent to F(y), i.e. in the present case the Fourier
transformation can be written as

$$F(y) = \frac{1}{\sqrt{2\pi}}\int_{-n}^{+n} e^{-ixy}\, f(x)\, dx. \qquad (145)$$

In the general case, the integral

$$\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty} e^{-ixy}\, f(x)\, dx \qquad (146)$$

may be meaningless, since the fact that f(x) belongs to $L_2$ in the
interval (−∞, +∞) does not imply that it belongs to $L_1$. Let us take
the function $f_{nm}(x)$, which is equal to f(x) for −n ≤ x ≤ m and zero



for x > m and x < −n. As we have seen, the Fourier transformation
of this function is given by

$$F_{nm}(y) = \frac{1}{\sqrt{2\pi}}\int_{-n}^{m} e^{-ixy}\, f_{nm}(x)\, dx. \qquad (147)$$

But $f_{nm}(x) \Rightarrow f(x)$ in $L_2$ as n → ∞ and m → ∞, so that $Tf_{nm}(x) \Rightarrow
Tf(x)$. If l.i.m. is used to denote the limit in the mean (in $L_2$), we can
write the Fourier transformation for any function of $L_2$ in the form

$$F(y) = Tf = \underset{n,\,m \to \infty}{\mathrm{l.i.m.}}\ \frac{1}{\sqrt{2\pi}}\int_{-n}^{m} e^{-ixy}\, f(x)\, dx. \qquad (148)$$

If f(x) belongs to $L_1$ as well as $L_2$ in the interval (−∞, +∞), given
any real y, integral (146) exists, this being the limit of integral (147)
as m and n → ∞. But, if the limit in the mean, and the limit every-
where, exist, they are the same, so that the Fourier transformation
can be written in the present case as

$$F(y) = Tf = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty} e^{-ixy}\, f(x)\, dx \qquad (f(x) \in L_1 \text{ and } L_2). \qquad (149)$$
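
For a function that is both absolutely integrable and square integrable, the transformation with kernel $e^{-ixy}$ and factor $1/\sqrt{2\pi}$ can be evaluated by ordinary quadrature over a large finite interval. The sketch below uses the Gaussian $e^{-x^2/2}$, a standard example not taken from the text, which this transformation is known to reproduce; the name `fourier_transform`, the cut-off `n` and the trapezoidal rule are illustrative assumptions.

```python
import cmath
import math

def fourier_transform(f, y, n=10.0, steps=4000):
    """Trapezoidal value of (1/sqrt(2pi)) * integral over [-n, n] of exp(-ixy) f(x) dx."""
    h = 2.0 * n / steps
    total = 0j
    for k in range(steps + 1):
        x = -n + k * h
        weight = 0.5 if k in (0, steps) else 1.0   # end points count half
        total += weight * cmath.exp(-1j * x * y) * f(x)
    return total * h / math.sqrt(2.0 * math.pi)

def gaussian(x):
    return math.exp(-x * x / 2.0)

for y in (0.0, 1.0, 2.5):
    # The transform of the Gaussian should again be the Gaussian, evaluated at y.
    print(y, fourier_transform(gaussian, y), gaussian(y))
```

Increasing the cut-off `n` imitates the passage to the limit in the truncated integrals; for this rapidly decaying function the values are already stable at n = 10.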


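Everything said above is true for the conjugate (inverse) transformation. On putting m = n, we have instead of (148):

$$f(y) = T^*F = \lim_{n\to\infty} \frac{1}{\sqrt{2\pi}} \int_{-n}^{+n} e^{ixy} F(x)\,dx \tag{150}$$

and instead of (149):

$$f(y) = T^*F = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} e^{ixy} F(x)\,dx \quad (F(x) \in L_2 \text{ and } L). \tag{151}$$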
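A convolution formula, to be obtained shortly, holds for the Fourier transformation [cf. IV; 45]. Let g(t) and f(t) ∈ L_2. Given any real x, g(x − t), regarded as a function of t, clearly belongs to L_2. Let us find T[g(x − t)]:

$$T[g(x-t)] = \lim_{a\to\infty} \frac{1}{\sqrt{2\pi}} \int_{-a}^{+a} g(x-t)\,e^{-iyt}\,dt = \lim_{a\to\infty} \frac{1}{\sqrt{2\pi}} \int_{x-a}^{x+a} g(u)\,e^{-iy(x-u)}\,du = e^{-ixy} \lim_{a\to\infty} \frac{1}{\sqrt{2\pi}} \int_{x-a}^{x+a} g(t)\,e^{iyt}\,dt,$$

i.e. T[g(x − t)] = e^{−ixy} T*[g(t)] and we can write, on recalling that the unitary transformation does not change a scalar product:

$$\int_{-\infty}^{+\infty} g(x-t) f(t)\,dt = \int_{-\infty}^{+\infty} e^{-ixy}\, T^*[g(t)]\, \overline{T[\overline{f(t)}]}\,dy,$$

where

$$T^*[g(t)] = \lim_{n\to\infty} \frac{1}{\sqrt{2\pi}} \int_{-n}^{+n} g(t)\,e^{iyt}\,dt; \qquad \overline{T[\overline{f(t)}]} = \lim_{n\to\infty} \frac{1}{\sqrt{2\pi}} \int_{-n}^{+n} f(t)\,e^{iyt}\,dt = T^*f,$$

and finally,

$$\int_{-\infty}^{+\infty} g(x-t) f(t)\,dt = \int_{-\infty}^{+\infty} G_1(y) F_1(y)\, e^{-ixy}\,dy, \tag{152}$$

where G_1(y) = T*g and F_1(y) = T*f.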
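Formula (152) can be checked numerically on two Gaussian functions, for which both sides are known in closed form. In the sketch below the choice g(t) = e^{−t²/2}, f(t) = e^{−(t−1)²/2}, the cut-offs and the helper names (trapezoid, nodes, conj_transform, left_side, right_side) are assumptions made for the illustration; for this pair both sides should equal √π e^{−(x−1)²/4}.

```python
import cmath
import math

def trapezoid(values, step):
    """Composite trapezoidal rule for equally spaced samples."""
    return step * (sum(values) - 0.5 * (values[0] + values[-1]))

def nodes(lo, hi, steps):
    step = (hi - lo) / steps
    return [lo + k * step for k in range(steps + 1)], step

def conj_transform(func, y, cutoff=10.0, steps=400):
    """Truncated T*: (1/sqrt(2*pi)) * integral of func(t) e^{iyt} dt over [-cutoff, cutoff]."""
    ts, step = nodes(-cutoff, cutoff, steps)
    samples = [func(t) * cmath.exp(1j * y * t) for t in ts]
    return trapezoid(samples, step) / math.sqrt(2.0 * math.pi)

def g(t):
    return math.exp(-t * t / 2.0)

def f(t):
    return math.exp(-(t - 1.0) ** 2 / 2.0)

# Samples of the product G_1(y) F_1(y) on a grid in y, computed once.
Y_NODES, Y_STEP = nodes(-8.0, 8.0, 200)
PRODUCT = [conj_transform(g, y) * conj_transform(f, y) for y in Y_NODES]

def left_side(x):
    """Integral of g(x - t) f(t) dt, the left-hand side of (152)."""
    ts, step = nodes(-10.0, 10.0, 400)
    return trapezoid([g(x - t) * f(t) for t in ts], step)

def right_side(x):
    """Integral of G_1(y) F_1(y) e^{-ixy} dy, the right-hand side of (152)."""
    samples = [p * cmath.exp(-1j * x * y) for p, y in zip(PRODUCT, Y_NODES)]
    return trapezoid(samples, Y_STEP)
```

Both functions here are summable, so that the limits in the mean reduce to ordinary integrals; the sketch says nothing about the general case of L_2.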

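The basic theorem can be proved in precisely the same way for functions of several variables. Here, the unitary transformation T is given by

$$Tf = \lim_{m_k\to\infty} \frac{1}{(2\pi)^{n/2}} \int_{-m_1}^{+m_1} \cdots \int_{-m_n}^{+m_n} f(x_1, \ldots, x_n)\, e^{-i(x_1y_1 + \cdots + x_ny_n)}\,dx_1 \cdots dx_n \tag{153}$$

and the inverse transformation by

$$T^*(F) = \lim_{m_k\to\infty} \frac{1}{(2\pi)^{n/2}} \int_{-m_1}^{+m_1} \cdots \int_{-m_n}^{+m_n} F(y_1, \ldots, y_n)\, e^{i(x_1y_1 + \cdots + x_ny_n)}\,dy_1 \cdots dy_n. \tag{154}$$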


Returning to the case of a single variable, certain further properties of the Fourier transformation may be mentioned. If we replace x by (−x) in (148), compare with (150) and remember that T* = T^{−1}, we get T²f(x) = f(−x), and similarly, T*²F(y) = F(−y). If f(x) is an even function, the transformation T carries it into an even function, and we have

$$F(y) = \lim_{n\to\infty} \sqrt{\frac{2}{\pi}} \int_0^n f(x) \cos xy\,dx; \qquad f(x) = \lim_{n\to\infty} \sqrt{\frac{2}{\pi}} \int_0^n F(y) \cos xy\,dy.$$

These formulae give a unitary transformation for functions of L_2 in the interval (0, ∞). On changing the sign and multiplying by i (these operations are clearly unitary transformations), we obtain, in the case of odd functions, the following mutually inverse unitary transformations in (0, ∞):

$$F(y) = \lim_{n\to\infty} \sqrt{\frac{2}{\pi}} \int_0^n f(x) \sin xy\,dx; \qquad f(x) = \lim_{n\to\infty} \sqrt{\frac{2}{\pi}} \int_0^n F(y) \sin xy\,dy.$$

179. Fourier transformations and Hermitian functions. We shall prove next that the Fourier transformation has four eigenvalues ±1, ±i, to which there is a corresponding closed system of eigenfunctions, orthonormal in (−∞, +∞); these functions are, in fact, the Hermitian functions [III_2; 156]. Let us recall the basic formulae concerning Hermitian functions. The Hermitian polynomials are defined by

$$H_n(x) = (-1)^n e^{x^2} \frac{d^n}{dx^n}\left(e^{-x^2}\right)$$

and the Hermitian functions by

$$\psi_n(x) = e^{-\frac{x^2}{2}} H_n(x).$$

They are orthogonal in the interval (−∞, +∞). The normalized Hermitian functions are:

$$\varphi_n(x) = \frac{1}{\sqrt{2^n\, n!\, \sqrt{\pi}}}\, \psi_n(x).$$

They form a closed orthonormal system. Let us show that φ_n(x) is the eigenfunction of the operator T, corresponding to the eigenvalue (−i)^n, i.e.

$$T\varphi_n = (-i)^n \varphi_n(y). \tag{155}$$
In other words, we have to show that

$$\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} e^{-iyx + \frac{x^2}{2}}\, \frac{d^n}{dx^n}\left(e^{-x^2}\right) dx = (-i)^n e^{\frac{y^2}{2}}\, \frac{d^n}{dy^n}\left(e^{-y^2}\right).$$


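We integrate by parts and note that the terms outside the integral sign vanish; denoting the left-hand side by L, we have

$$L = \frac{(-1)^n}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} e^{-x^2}\, \frac{d^n}{dx^n}\left(e^{-iyx + \frac{x^2}{2}}\right) dx.$$

We now multiply by e^{y²/2} outside and by e^{−y²/2} inside the integral sign:

$$L = \frac{(-1)^n}{\sqrt{2\pi}}\, e^{\frac{y^2}{2}} \int_{-\infty}^{+\infty} e^{-x^2}\, \frac{d^n}{dx^n}\, e^{\frac{1}{2}(x-iy)^2}\,dx = \frac{(-1)^n i^n}{\sqrt{2\pi}}\, e^{\frac{y^2}{2}}\, \frac{d^n}{dy^n} \int_{-\infty}^{+\infty} e^{-x^2 + \frac{1}{2}(x-iy)^2}\,dx = (-i)^n e^{\frac{y^2}{2}}\, \frac{d^n}{dy^n}\left[e^{-\frac{y^2}{2}}\, \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} e^{-\frac{x^2}{2} - ixy}\,dx\right].$$

On differentiating with respect to the parameter y, the last integral, together with its factor 1/√(2π), is easily shown to be equal to e^{−y²/2}, and (155) is therefore proved. If account is taken of the closure of the system of Hermitian functions, it can easily be shown that the points λ = ±1 and λ = ±i exhaust the spectrum of operator T.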
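Relation (155) lends itself to a direct numerical check. The sketch below builds the Hermitian polynomials from the standard recurrence H_{k+1}(x) = 2xH_k(x) − 2kH_{k−1}(x), which is consistent with the definition recalled above, and compares the truncated transform of φ_n with (−i)^n φ_n. The function names, the cut-off 12 and the step count are assumptions made for the illustration.

```python
import cmath
import math

def hermite(n, x):
    """H_n(x) from the recurrence H_{k+1} = 2x H_k - 2k H_{k-1}, with H_0 = 1, H_1 = 2x."""
    h_prev, h_curr = 1.0, 2.0 * x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h_curr = h_curr, 2.0 * x * h_curr - 2.0 * k * h_prev
    return h_curr

def phi(n, x):
    """Normalized Hermitian function: e^{-x^2/2} H_n(x) / sqrt(2^n n! sqrt(pi))."""
    norm = math.sqrt(2.0 ** n * math.factorial(n) * math.sqrt(math.pi))
    return math.exp(-x * x / 2.0) * hermite(n, x) / norm

def fourier(f, y, cutoff=12.0, steps=2400):
    """Trapezoidal value of (1/sqrt(2*pi)) * integral of e^{-ixy} f(x) dx over [-cutoff, cutoff]."""
    h = 2.0 * cutoff / steps
    total = 0.0 + 0.0j
    for k in range(steps + 1):
        x = -cutoff + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * cmath.exp(-1j * x * y) * f(x)
    return total * h / math.sqrt(2.0 * math.pi)

def eigen_defect(n, y):
    """|T phi_n (y) - (-i)^n phi_n(y)|, which should vanish up to quadrature error."""
    return abs(fourier(lambda x: phi(n, x), y) - (-1j) ** n * phi(n, y))

worst_defect = max(eigen_defect(n, y) for n in range(5) for y in (0.0, 0.7, 1.5))
```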


180. The operation of multiplication. Let us consider the operation of multiplication by the independent variable in a finite interval, x = 0 being taken as the left-hand end of the interval. In other words, we consider L_2 in the finite interval [0, a], and the operation of multiplication by the independent variable:

$$Af(x) = xf(x). \tag{156}$$

We have

$$(Af, g) = \int_0^a x f(x)\, \overline{g(x)}\,dx \quad \text{and} \quad (Af, f) = \int_0^a x\, |f(x)|^2\,dx,$$


whence it is clear that A is a self-conjugate operator, and that its norm does not exceed a. If we take f(x) that are non-zero only in a small neighbourhood of x = a, it may easily be seen that the norm of A is exactly equal to a. Given the condition ||f|| = 1, the bounds of the quadratic form (Af, f) are: m = 0 and M = a. The equation for the eigenvalues and eigenelements has the form xf(x) = λf(x) or (x − λ)f(x) = 0, whence it is clear that f(x) is equivalent to zero, i.e. there are no eigenvalues, and the spectrum is purely continuous. The resolvent clearly has the form R_λ f(x) = f(x)/(x − λ). If λ lies outside [0, a], then R_λ f(x) ∈ L_2. If λ is in [0, a], R_λ f(x) does not belong to L_2 for every f(x). The operator (A − λE)f(x) = (x − λ)f(x) here transforms L_2 one-to-one into the lineal M_λ of functions ψ(x) = (x − λ)f(x) such that ψ(x)/(x − λ) ∈ L_2. Let us find the spectral function ℰ_λ, where λ must be assumed to belong to [0, a]. On observing that

$$\lim_{\tau\to+0} \int_0^\lambda \left[\frac{1}{x-\sigma-i\tau} - \frac{1}{x-\sigma+i\tau}\right] d\sigma = 2i \lim_{\tau\to+0} \left(\arctan\frac{\lambda-x}{\tau} + \arctan\frac{x}{\tau}\right) = \begin{cases} 0 & \text{for } \lambda < x, \\ 2\pi i & \text{for } \lambda > x, \end{cases}$$

we obtain for any elements f(x) and φ(x) [144]:

$$(\mathcal{E}_\lambda f, \varphi) = \lim_{\tau\to+0} \frac{1}{2\pi i} \int_0^\lambda \left[\int_0^a \left(\frac{1}{x-\sigma-i\tau} - \frac{1}{x-\sigma+i\tau}\right) f(x)\, \overline{\varphi(x)}\,dx\right] d\sigma = \int_0^\lambda f(x)\, \overline{\varphi(x)}\,dx,$$

whence it follows that

$$\mathcal{E}_\lambda f(x) = \begin{cases} f(x) & \text{for } x < \lambda, \\ 0 & \text{for } x > \lambda. \end{cases} \tag{157}$$
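The statements about the norm of A and the spectral function (157) can be illustrated on a grid. In the following sketch functions on [0, a] are replaced by their values at the midpoints of equal cells; the value a = 2, the number of cells, the sample function and all names are assumptions made for the illustration. The sketch checks that the Stieltjes sum of λ against the increments of (ℰ_λ f, f) reproduces (Af, f), and that a function concentrated near x = a gives a value of (Af, f)/(f, f) close to a.

```python
import math

A_END = 2.0                      # right-hand end a of the interval [0, a] (illustrative value)
CELLS = 400
H = A_END / CELLS
GRID = [(k + 0.5) * H for k in range(CELLS)]   # cell midpoints

def inner(f, g):
    """Discrete scalar product of real-valued samples."""
    return sum(fv * gv for fv, gv in zip(f, g)) * H

def multiply(f):
    """The operator (156): multiplication by the independent variable."""
    return [x * fv for x, fv in zip(GRID, f)]

def spectral(lam, f):
    """The projection (157): keep f for x < lam, replace it by zero for x > lam."""
    return [fv if x < lam else 0.0 for x, fv in zip(GRID, f)]

f = [math.sin(3.0 * x) + 1.0 for x in GRID]

# Stieltjes sum of lambda d(E_lambda f, f) over the cell boundaries.
edges = [k * H for k in range(CELLS + 1)]
stieltjes_sum = sum(
    0.5 * (edges[k] + edges[k + 1])
    * (inner(spectral(edges[k + 1], f), f) - inner(spectral(edges[k], f), f))
    for k in range(CELLS)
)
quadratic_form = inner(multiply(f), f)

# A function concentrated near x = a gives a quotient (Af, f)/(f, f) close to a.
bump = [1.0 if x > A_END - 0.05 else 0.0 for x in GRID]
rayleigh = inner(multiply(bump), bump) / inner(bump, bump)
```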




On taking f(x) = 1, we get the differential solution ω(x, λ) = 1 for x < λ and ω(x, λ) = 0 for x > λ. It may easily be seen, on using (157) and property 11 of [52], that there are no solutions orthogonal to it.

Let us consider the more general self-conjugate operator

$$Bf(x) = \omega(x) f(x), \tag{158}$$


where ω(x) is real, measurable and bounded in [0, a]. The equation for the eigenvalues and eigenelements has the form [ω(x) − λ]f(x) = 0. Let K_λ be the set of x satisfying ω(x) = λ. If the measure of K_λ is zero, λ is not an eigenvalue. If the measure of K_λ is greater than zero, λ is an eigenvalue, and any complete system of functions, orthogonal on the set K_λ, is a complete system of eigenfunctions corresponding to the eigenvalue in question, where these functions must be assumed zero outside K_λ. If the measure of K_λ is zero for any λ, operator (158) has a purely continuous spectrum. Its spectral function is given by an expression analogous to (157):

$$\mathcal{E}_\lambda f(x) = \begin{cases} f(x) & \text{for } \omega(x) < \lambda, \\ 0 & \text{for } \omega(x) > \lambda. \end{cases} \tag{159}$$
Everything said may be readily extended to the case of functions of several variables. For instance, we can define a self-conjugate operator of multiplication by the independent variable: Af = x_1 f, for functions f(x_1, x_2, ..., x_n) that belong to L_2 in some finite interval a_s ≤ x_s ≤ b_s (s = 1, 2, ..., n). This operator has a purely continuous spectrum in the interval a_1 ≤ λ ≤ b_1 and its spectral function is defined as follows:

$$\mathcal{E}_\lambda f(x_1, x_2, \ldots, x_n) = \begin{cases} f(x_1, x_2, \ldots, x_n) & \text{for } x_1 < \lambda, \\ 0 & \text{for } x_1 > \lambda. \end{cases} \tag{160}$$


Let us return to the case of one variable. The operator of multiplication by the independent variable in an infinite interval is no longer bounded. We shall consider it below. If we take operator (158) and assume that ω(x) is a bounded function in an infinite interval, we now get a bounded linear operator. We shall thus take as our basis space L_2 in (−∞, +∞), and let ω(x) be a real, bounded and measurable function in this interval. Now, (158) defines a self-conjugate bounded operator. If ω(x) is continuous in the closed interval [−∞, +∞], the bounds of the operator coincide with the minimum and maximum values of ω(x).


181. Kernels that depend on a difference. If we use operator (158) in the interval (−∞, +∞), and pass with the aid of the transformation T to the unitarily equivalent operators, bounded self-conjugate integral operators with kernels depending on a difference are easily formed.

Let us outline the method. The unitary equivalent of (158): B′ = T*BT, is evidently given by

$$B' f(x) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \left[\omega(y) \int_{-\infty}^{+\infty} f(t)\, e^{-iyt}\,dt\right] e^{ixy}\,dy.$$

Here and below, we shall simply write the integral with infinite limits instead of lim as n → ∞. Assuming that ω(y) is summable in (−∞, +∞) as well as bounded, and that f(t) of L_2 is also summable, we can change the order of integration, and obtain

$$B' f(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} \left[\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} \omega(y)\, e^{i(x-t)y}\,dy\right] f(t)\,dt$$

or, on introducing the function

$$g(u) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} \omega(y)\, e^{iuy}\,dy = T^*\omega, \tag{161}$$

we can write the operator B′ as

$$B' f(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} g(x-t) f(t)\,dt. \tag{162}$$


As we know, the spectral function of B′ is given by ℰ′_λ = T*ℰ_λT, where ℰ_λ is the spectral function of B. If the kernel satisfies condition (97) of [172], as will be the case in the following examples, i.e.

$$\int_{-\infty}^{+\infty} |g(u)|\,du < +\infty, \tag{163}$$

(162) is evidently applicable to the whole of L_2, and not merely to the f(x) which are summable in (−∞, +∞). Let us consider some examples of this method.

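1. Let

$$Bf(x) = \frac{2}{1+x^2}\, f(x). \tag{164}$$

The bounds of the operator are m = 0 and M = 2. Given any λ of the interval [0, 2], the equation 2/(1 + x²) = λ has not more than two roots and operator (164) has a purely continuous spectrum.

The kernel of the operator B′ is given by

$$g(u) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} \frac{2\,e^{iuy}}{1+y^2}\,dy = \frac{2}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} \frac{\cos yu}{1+y^2}\,dy = \sqrt{2\pi}\, e^{-|u|}$$

and

$$B' f(x) = \int_{-\infty}^{+\infty} e^{-|x-y|} f(y)\,dy. \tag{165}$$

The kernel obtained satisfies condition (94) of [172]:

$$\int_{-\infty}^{+\infty} |K(x, y)|\,dy = \int_{-\infty}^{+\infty} e^{-|x-y|}\,dy = \int_{-\infty}^{+\infty} e^{-|y|}\,dy = 2.$$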


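By (159), the spectral function of operator (164) is defined thus: ℰ_λ f(x) = f(x) for 2/(1 + x²) < λ and ℰ_λ f(x) = 0 for 2/(1 + x²) > λ, i.e.

$$\mathcal{E}_\lambda f(x) = \begin{cases} f(x) & \text{for } |x| > \mu, \\ 0 & \text{for } |x| < \mu, \end{cases}$$

where μ = √((2 − λ)/λ), and

$$\mathcal{E}'_\lambda f(x) = T^* \mathcal{E}_\lambda T f(x) = \frac{1}{2\pi} \left(\int_{-\infty}^{-\mu} + \int_{\mu}^{+\infty}\right) \left[\int_{-\infty}^{+\infty} f(t)\, e^{-iyt}\,dt\right] e^{ixy}\,dy,$$

i.e.

$$\mathcal{E}'_\lambda f(x) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \left[\int_{-\infty}^{+\infty} f(t)\, e^{-iyt}\,dt\right] e^{ixy}\,dy - \frac{1}{2\pi} \int_{-\mu}^{+\mu} \left[\int_{-\infty}^{+\infty} f(t)\, e^{-iyt}\,dt\right] e^{ixy}\,dy.$$

The improper integrals with infinite limits must be taken in the sense of the mean square approximations. On changing the order in the last integral (the possibility of this is easily proved), and remembering that T*T = E, we get

$$\mathcal{E}'_\lambda f(x) = f(x) - \frac{1}{\pi} \int_{-\infty}^{+\infty} \frac{\sin \mu(x-t)}{x-t}\, f(t)\,dt. \tag{166}$$

The operator B′ has a purely continuous spectrum, and ℰ′_λ f(x) must tend (in the mean) to zero as λ → 0, i.e.

$$\lim_{\mu\to\infty} \frac{1}{\pi} \int_{-\infty}^{+\infty} \frac{\sin \mu(x-t)}{x-t}\, f(t)\,dt = f(x). \tag{167}$$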

Let us form the differential solutions for operator (165). It is easily seen that the homogeneous equation B′f(x) = λf(x) has solutions cos μx and sin μx not belonging to L_2, i.e.

$$\int_{-\infty}^{+\infty} e^{-|x-y|} \cos \mu y\,dy = \lambda \cos \mu x; \qquad \int_{-\infty}^{+\infty} e^{-|x-y|} \sin \mu y\,dy = \lambda \sin \mu x.$$
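The two relations just written, with λ = 2/(1 + μ²), are easily confirmed numerically. The sketch below applies the kernel e^{−|x−y|} to cos μy and sin μy by the trapezoidal rule on a symmetric interval about x; the half-width 40, the step count and the function names are assumptions made for the illustration, and the agreement is limited by the quadrature error near the corner of the kernel at y = x.

```python
import math

def apply_kernel(func, x, half_width=40.0, steps=16000):
    """Trapezoidal value of the integral of e^{-|x-y|} func(y) dy over [x - half_width, x + half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        y = x - half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(-abs(x - y)) * func(y)
    return total * h

def eigen_defects(mu, x):
    """Deviations from lambda*cos(mu x) and lambda*sin(mu x), where lambda = 2/(1 + mu^2)."""
    lam = 2.0 / (1.0 + mu * mu)
    cos_defect = abs(apply_kernel(lambda y: math.cos(mu * y), x) - lam * math.cos(mu * x))
    sin_defect = abs(apply_kernel(lambda y: math.sin(mu * y), x) - lam * math.sin(mu * x))
    return cos_defect, sin_defect

worst = max(max(eigen_defects(mu, x)) for mu in (0.5, 1.0, 3.0) for x in (-1.0, 0.0, 2.0))
```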

On multiplying both sides by e^{−μ} dμ/dλ and integrating with respect to λ from 0 to λ, or what amounts to the same thing, with respect to μ from μ = ∞ to μ, we get the following two differential solutions:

$$\omega_1(x, \lambda) = \int_\infty^\mu e^{-\mu} \cos \mu x\,d\mu = -e^{-\mu} \left(\frac{\cos \mu x}{1+x^2} - \frac{x \sin \mu x}{1+x^2}\right),$$
$$\omega_2(x, \lambda) = \int_\infty^\mu e^{-\mu} \sin \mu x\,d\mu = -e^{-\mu} \left(\frac{x \cos \mu x}{1+x^2} + \frac{\sin \mu x}{1+x^2}\right). \tag{168}$$

These functions belong to L_2 and it follows from the method by which they are obtained that they satisfy equation (129) and vanish for λ = 0, i.e. for μ = ∞. The factor e^{−μ} is included so as to enable the integration to be performed from μ = ∞, and hence to obtain solutions continuous as far as λ = 0. Solutions (168) are mutually orthogonal [176] in the basic interval (−∞, +∞), since one of them is even and the other odd.

Let us write down (120) and (121) of [176]. Simple working gives

$$\varrho_1(\lambda) = \varrho_2(\lambda) = \frac{\pi}{2}\, e^{-2\mu},$$

and

$$g_1(\lambda) = \int_{-\infty}^{+\infty} \left[\int_\infty^\mu e^{-\mu} \cos \mu y\,d\mu\right] \varphi(y)\,dy, \qquad g_2(\lambda) = \int_{-\infty}^{+\infty} \left[\int_\infty^\mu e^{-\mu} \sin \mu y\,d\mu\right] \varphi(y)\,dy.$$

The completeness of the system of solutions (168) can be proved with the aid of (128) of [176]. If we apply (123) to the real function φ(x) of L_2 and the function ψ(x), equal to unity in the interval (0, x] and zero outside it, we obtain after elementary transformations:

$$\int_0^x \varphi(x)\,dx = \frac{1}{\pi} \int_0^\infty \left[\frac{\sin \mu x}{\mu} \int_{-\infty}^{+\infty} \varphi(y) \cos \mu y\,dy + \frac{1 - \cos \mu x}{\mu} \int_{-\infty}^{+\infty} \varphi(y) \sin \mu y\,dy\right] d\mu,$$

which, given certain supplementary assumptions, leads to the ordinary Fourier formula. Notice that solutions (168) must be obtained by application of the operator ℰ′_λ to the functions ω_k(x, 2), i.e., apart from sign, to 1/(1 + x²) and x/(1 + x²); this is easily verified. If we had not included the factor e^{−μ} in the integration, and had integrated from μ = 0, we should have obtained the simple differential solutions (sin μx)/x and (1 − cos μx)/x, which become meaningless with λ = 0, their norms increasing indefinitely as λ → 0.

Let us consider the general case of transformation (162), on the assumption that g(u) is a real even function satisfying condition (163). Operator (162) is now defined throughout L_2 and is bounded and self-conjugate. We can form

$$G_1(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} g(u)\, e^{iut}\,du; \qquad F_1(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} f(u)\, e^{iut}\,du, \tag{169}$$

where

$$|G_1(t)| \le \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} |g(u)|\,du,$$

i.e. G_1(t) is a bounded function and F_1(t) ∈ L_2, so that G_1(t)F_1(t) ∈ L_2. It can be shown that (152) holds in this case, and can be written as

$$\varphi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} g(x-t) f(t)\,dt = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} G_1(t) F_1(t)\, e^{-ixt}\,dt = T[G_1(t) F_1(t)],$$

whence it follows that G_1(t)F_1(t) = T*[φ(x)], and, on taking into account the second of formulae (169), (162) is seen to be the unitary equivalent of the operation of multiplication by the function G_1(t) of the independent variable.

We must mention a type of kernel that leads to a kernel dependent on a difference. Let K(x, y) be a real symmetric kernel in (0, ∞), which is a homogeneous function of degree −1. If we replace x and y in the integral operator with this kernel:

$$\varphi(x) = \int_0^{+\infty} K(x, y) f(y)\,dy \tag{170}$$

by the new independent variables x = e^s and y = e^t, and replace φ(x) and f(y) by φ_1(s) = e^{s/2} φ(e^s) and f_1(t) = e^{t/2} f(e^t), we get the integral operator

$$\varphi_1(s) = \int_{-\infty}^{+\infty} K_1(s, t) f_1(t)\,dt$$

with a kernel depending on |s − t|. For, K(x, y) = x^{−1} K(1, y/x) since it is homogeneous, and we can write, on setting K(1, z) = ω(z):

$$K_1(s, t) = e^{\frac{s+t}{2}}\, e^{-s}\, \omega(e^{t-s}) = e^{\frac{t-s}{2}}\, \omega(e^{t-s}),$$

where, in view of the symmetry of K(x, y), the last expression is an even function of (t − s). Since ds = dx/x, given our change of variables, space L_2 of functions in (0, ∞) is seen to become space L_2 of functions in (−∞, +∞).
The norm of operator (170) can be found directly with the aid of the following simple theorem:

THEOREM. If K(x, y) is non-negative, homogeneous of degree (−1) and

$$\int_0^\infty K(x, 1)\, x^{-\frac{1}{2}}\,dx = \int_0^\infty K(1, y)\, y^{-\frac{1}{2}}\,dy = k, \tag{171}$$

then

$$|I| = \left|\int_0^\infty \int_0^\infty K(x, y) f(x) g(y)\,dx\,dy\right| \le k\, ||f||\, ||g||. \tag{172}$$

Notice that, since the kernel is homogeneous, the integrals of (171) are equal. On rewriting the integrand as f(x) √K (x/y)^{1/4} · g(y) √K (y/x)^{1/4} and using Bunyakovskii's inequality, we get |I| ≤ √A √B, where

$$A = \int_0^\infty |f(x)|^2 \left[\int_0^\infty K(x, y) \left(\frac{x}{y}\right)^{\frac{1}{2}} dy\right] dx = k\, ||f||^2$$

and similarly, B = k ||g||², whence (172) follows. It follows from (172) that the norm of the operator with kernel K(x, y) does not exceed k. In particular, if we put K(x, y) = 1/(x + y), since

$$\int_0^\infty \frac{x^{-\frac{1}{2}}}{1+x}\,dx = \pi,$$

we obtain

$$\left|\int_0^\infty \int_0^\infty \frac{f(x) g(y)}{x+y}\,dx\,dy\right| \le \pi\, ||f||\, ||g||.$$

On performing the above change of variables, it can be shown that the operator with kernel 1/(x + y) has a continuous spectrum in the interval (0, π].
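The value k = π for the kernel 1/(x + y) can be confirmed numerically by the same change of variable x = e^s as was used above, under which the integrand of (171) becomes 1/(2 cosh(s/2)). The sketch below also evaluates the double integral of (172) for one pair of sample functions; the cut-offs, the sample function 1/x on [1, 50] and all names are assumptions made for the illustration, and the inequality is of course far from an equality for this pair.

```python
import math

def hilbert_constant(cutoff=80.0, steps=3200):
    """k of (171) for K(x, y) = 1/(x + y), after the substitution x = e^s."""
    h = 2.0 * cutoff / steps
    total = 0.0
    for k in range(steps + 1):
        s = -cutoff + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight / (2.0 * math.cosh(s / 2.0))
    return total * h

def midpoints(lo, hi, cells):
    h = (hi - lo) / cells
    return [lo + (k + 0.5) * h for k in range(cells)], h

def bilinear_form(f, g, lo, hi, cells=200):
    """Midpoint-rule value of the double integral of f(x) g(y) / (x + y) over [lo, hi] x [lo, hi]."""
    pts, h = midpoints(lo, hi, cells)
    return sum(f(x) * g(y) / (x + y) for x in pts for y in pts) * h * h

def norm(f, lo, hi, cells=200):
    pts, h = midpoints(lo, hi, cells)
    return math.sqrt(sum(f(x) ** 2 for x in pts) * h)

def sample(x):
    return 1.0 / x

k_value = hilbert_constant()
form_value = bilinear_form(sample, sample, 1.0, 50.0)
bound = k_value * norm(sample, 1.0, 50.0) ** 2
```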


182. Weak convergence. We have already investigated weak convergence in L_p. Let us recall the basic results for the case p = 2. If we consider L_2(ℰ), where ℰ is any fixed measurable set, the weak convergence of φ_n(x) to φ(x) is defined by

$$\lim_{n\to\infty} \int_{\mathcal{E}} \psi(x)\, \varphi_n(x)\,dx = \int_{\mathcal{E}} \psi(x)\, \varphi(x)\,dx, \tag{173}$$

for any function ψ(x) ∈ L_2(ℰ). The necessary and sufficient condition for weak convergence is as follows: (1) the norms ||φ_n|| are bounded in L_2(ℰ); (2) condition (173) is fulfilled on the set of elements ψ(x) ∈ L_2(ℰ), the linear envelope of which is dense in L_2(ℰ). In the one-dimensional case, if ℰ is a finite or infinite interval, the second condition can be replaced by

$$\lim_{n\to\infty} \int_c^{\xi} \varphi_n(x)\,dx = \int_c^{\xi} \varphi(x)\,dx,$$

where c is any fixed number of the interval in question and ξ is any number from this interval.
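A standard example of a sequence that converges weakly but not in norm is φ_n(x) = sin nx in L_2 on (0, 2π): the norms remain equal to √π, whilst the integrals (173) tend to zero. The sketch below checks this for the test function ψ(x) = x, for which the integral equals −2π/n; the choice of interval, test function, step count and names are assumptions made for the illustration.

```python
import math

def integrate(func, lo, hi, steps=20000):
    """Composite trapezoidal rule."""
    h = (hi - lo) / steps
    total = 0.5 * (func(lo) + func(hi))
    for k in range(1, steps):
        total += func(lo + k * h)
    return total * h

def pairing(n):
    """Integral of psi(x) * sin(n x) over (0, 2*pi) with psi(x) = x, cf. (173)."""
    return integrate(lambda x: x * math.sin(n * x), 0.0, 2.0 * math.pi)

def norm_squared(n):
    """Square of the norm of sin(n x) in L_2 on (0, 2*pi)."""
    return integrate(lambda x: math.sin(n * x) ** 2, 0.0, 2.0 * math.pi)

pairings = {n: pairing(n) for n in (1, 10, 100)}
norms_squared = {n: norm_squared(n) for n in (1, 10, 100)}
```

The pairings decrease like 1/n, in agreement with weak convergence to zero, while the norms do not decrease at all.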

The following can be proved: if a sequence of functions φ_n(x) of L_2(ℰ) is weakly convergent to some function φ(x) and is convergent almost everywhere on ℰ to some function ω(x) ∈ L_2(ℰ), φ(x) and ω(x) must be equivalent.


183. Other concrete forms of space H. In addition to l_2 and L_2, a number of other useful concrete forms of Hilbert space may be mentioned. Let ℰ be a measurable set of n-dimensional space and L_2 the space of functions, measurable and square summable on ℰ, the measure being based on the Lebesgue measure or some other normal set function. In the latter case, we have the Lebesgue-Stieltjes integral. Let us define space L_{2,m} as follows: An element of L_{2,m} is a sequence of m functions of L_2: (f_1, f_2, ..., f_m), where f_k ∈ L_2 (k = 1, 2, ..., m). An element is zero if each of the f_k is equivalent to zero. Multiplication of an element by a number, and addition of elements, are naturally defined by

$$a(f_1, f_2, \ldots, f_m) = (af_1, af_2, \ldots, af_m),$$
$$(f_1, f_2, \ldots, f_m) + (g_1, g_2, \ldots, g_m) = (f_1 + g_1, f_2 + g_2, \ldots, f_m + g_m)$$

and the scalar product by

$$(x, y) = \int_{\mathcal{E}} \left(f_1 \bar{g}_1 + f_2 \bar{g}_2 + \cdots + f_m \bar{g}_m\right) d\omega,$$

where dω is an element of R_n in the case of the Lebesgue integral or the differential of the normal set function in the case of the Lebesgue-Stieltjes integral. It is easily verified that L_{2,m} is a concrete form of separable Hilbert space. A linear operator y = Ax in L_{2,m} consists of m² linear operators A_{ik} (i, k = 1, 2, ..., m) in L_2 with the aid of which the components of y are expressible in terms of the components of x:

$$y_i = \sum_{k=1}^{m} A_{ik} x_k.$$

This form of Hilbert space is a realization of an abstract construction of Hilbert space H from given Hilbert spaces H_1, H_2, ..., H_m. An element x of space H is defined as a sequence of elements (x_1, x_2, ..., x_m), where x_k ∈ H_k. An element x is zero if all the x_k are zero elements in the H_k (k = 1, 2, ..., m). Multiplication by a number and addition of elements are defined by

$$a(x_1, x_2, \ldots, x_m) = (ax_1, ax_2, \ldots, ax_m),$$
$$(x_1, x_2, \ldots, x_m) + (y_1, y_2, \ldots, y_m) = (x_1 + y_1, x_2 + y_2, \ldots, x_m + y_m)$$

and the scalar product by

$$(x, y) = \sum_{k=1}^{m} (x_k, y_k).$$

Every space W_2^{(l)}(D) [112] of functions φ(x) belonging to L_2(D), where D is a domain of n-dimensional space, and having generalized derivatives up to order l that also belong to L_2(D), is a complete Hilbert space with the scalar product

$$(\varphi, \psi) = \int_D \left[\varphi(x)\, \overline{\psi(x)} + \sum_{1 \le |\alpha| \le l} D^\alpha \varphi(x)\, \overline{D^\alpha \psi(x)}\right] dx, \tag{174}$$

where the summation is extended over all derivatives up to and including order l. It is assumed here that the domain is star-shaped with respect to any point of it, so that the property of generalized derivatives indicated in [111] holds.

Let us consider space W_2^{(1)}(D). Functions φ(x) belonging to it have limiting values on the surface S of domain D (S is assumed sufficiently smooth). It is easily seen that the set of φ(x) ∈ W_2^{(1)}(D), satisfying the boundary condition

$$\varphi(x)\,|_S = 0,$$

is a complete Hilbert space with the scalar product

$$(\varphi, \psi) = \int_D \sum_{k=1}^{n} \frac{\partial \varphi(x)}{\partial x_k}\, \frac{\partial \overline{\psi(x)}}{\partial x_k}\,dx. \tag{175}$$

The set of functions of W_2^{(1)}(D) with scalar product (175) and without any boundary condition is also a complete Hilbert space, if we identify functions whose difference is equivalent to a constant, i.e. regard such functions as the same element of the space.


§ 3. Unbounded operators 


184. Closed operators. Let us turn to a consideration of distributive operators, which may not be specified throughout the whole of H, and which are not assumed to be bounded (to have finite norms). The notation to be adopted is as follows. Let A be a distributive operator, D(A) its domain of definition (which we shall always assume to be a lineal), and R(A) the range of values of A. It is also a lineal, since A is distributive. If A establishes a one-to-one correspondence between elements of D(A) and R(A), the inverse operator A^{-1} is defined on R(A).

The necessary and sufficient condition for the existence of A^{-1} is that the equation Ax = 0 has only the zero solution (on D(A)) [127].




Operators A and B are said to coincide (be equal), which is written as A = B, if their domains of definition coincide and if Ax = Bx on all elements of this domain. We say that operator B is an extension of operator A, and write A ⊆ B, if D(A) belongs to D(B) and Ax = Bx for x ∈ D(A). The symbol A ⊆ B includes the possibility of A = B. If, when Ax = Bx for x ∈ D(A), the lineal D(B) is strictly greater than D(A), we write A ⊂ B. Notice also that (A + B)x = Ax + Bx has a meaning if x ∈ D(A) and x ∈ D(B), whilst (AB)x = A(Bx), if x ∈ D(B) and Bx ∈ D(A). Since we are not assuming that an operator is defined everywhere and bounded in norm, we cannot assert that it is continuous. However, an analysis of the fundamental properties that we have proved for bounded operators shows that many of the properties are consequences, not of the continuity of the operators, but of a weaker property, that of being “closed”. We shall turn next to the definition and analysis of this extremely important property of linear operators.

DEFINITION. An operator A is said to be closed if the following condition is fulfilled: if x_n ∈ D(A) (n = 1, 2, ...) and the sequences x_n and Ax_n have limits: x_n ⇒ x_0, Ax_n ⇒ y_0, then x_0 ∈ D(A) and Ax_0 = y_0.

If an operator is not closed, the question arises as to whether it has closed extensions. If there exist two sequences x_n and x′_n of D(A), having the same limit and such that Ax_n and Ax′_n have different limits, the operator A evidently does not permit of closed extensions. But if, given the same limits for x_n and x′_n, we never get different limits for Ax_n and Ax′_n, A admits of closed extensions, among which there is a minimal closed extension, which is usually denoted by Ā. Let us describe the formation of Ā. If x_n ∈ D(A), x_n ⇒ x_0, and Ax_n ⇒ y_0, we include x_0 in the domain of definition of Ā and put Āx_0 = y_0. In view of the above-mentioned condition, Ā is defined uniquely. By using the triangle inequality, Ā is easily shown to be a closed operator. This operation of extension of A is called closure of A. If B is any closed extension of A, it is easily seen that Ā ⊆ B.

THEOREM 1. If A is a closed operator, and B is an operator bounded on D(A), A + B is also a closed operator; A^{-1}, if it exists, is a closed operator, and the set of solutions of the equation Ax = 0 is a subspace.

All these assertions are proved directly from the definition of closed 
operator. 

THEOREM 2. If A permits of closure and has a bounded inverse A^{-1} on R(A), Ā has an inverse Ā^{-1} which is defined in the subspace $\overline{R(A)}$ and is bounded.

If R(A) is a subspace, A^{-1} is a closed operator. Otherwise, the bounded operator A^{-1} can be extended from the lineal R(A) on to the subspace $\overline{R(A)}$. Let B denote the bounded operator thus obtained (D(B) = $\overline{R(A)}$). The equation Bx = 0 is easily seen to have only a zero solution on D(B). Otherwise, there would exist a sequence x_n ∈ D(A) such that x_n ⇒ 0, whilst Ax_n ⇒ y ≠ 0. But this contradicts the fact that A admits of closure, since, if we take x′_n = 0 (n = 1, 2, ...), then Ax′_n = 0. The operator A_1 = B^{-1} is obviously the closure of A. The theorem is proved.

Note. It will be seen that, in the present case, the closure of A is uniquely connected with the extension by continuity of the bounded operator A^{-1}.

COROLLARY. If A is a closed operator and the bounded inverse A^{-1} exists on R(A), R(A) is a subspace.


185. Conjugate operators. We start from the following simple remark: if the element z is orthogonal to the lineal l, dense in H, z is the zero element.

For, let (x, z) = 0 if x ∈ l, and let y be any element of H. Since l is dense in H, there exists a sequence of elements x_n of l such that x_n ⇒ y. By the property of z, (x_n, z) = 0, and in the limit (y, z) = 0, i.e. z is orthogonal to any element of H, and in particular, to itself, i.e. (z, z) = ||z||² = 0, whence z = 0. If l is not dense, there obviously exists a non-zero element z orthogonal to l.

We shall always assume below that the operators are distributive. 

Suppose that the operator A is defined on the lineal D(A), dense in H. We form (Ax, y), where x ∈ D(A) and y is any element of H. There exist elements y such that (Ax, y) can be written for any x of D(A) as

(Ax, y) = (x, y*) (x ∈ D(A)), (1)

where y* is an element of H. For instance, if y = 0, then (Ax, 0) = (x, 0) for any x of D(A). If the form (1) is possible for a certain y, the y* in this form is unique. For, if, for some y, we had (Ax, y) = (x, y_1*) and (Ax, y) = (x, y_2*) for x ∈ D(A), subtraction would give (x, y_1* − y_2*) = 0, i.e. y_1* − y_2* would be orthogonal to the lineal D(A), whence y_1* = y_2*. The set of elements y for which (Ax, y) can be written in form (1) is obviously some lineal l*, and a distributive operator, transforming y to y*, is defined on this lineal. This operator is called the conjugate to A and is written as A*; thus y* = A*y and l* is D(A*), and (1) can be rewritten as

(Ax, y) = (x, A*y) (x ∈ D(A); y ∈ D(A*)). (2)


It follows from the above that the necessary and sufficient condition 
for the existence of A* is that the lineal D(A) be dense in H. As men- 
tioned above, A* is a distributive operator. We have the previous 
definition of A* for a bounded operator A. Let us now mention some 
properties of the conjugate operator. 

THEOREM 1. The operator A* is closed. 

Let x_n ∈ D(A*) and x_n ⇒ x_0, A*x_n ⇒ y_0. By definition of A*, we have (Ax, x_n) = (x, A*x_n), where x ∈ D(A), and, on passing to the limit, we get (Ax, x_0) = (x, y_0), whence, by definition of A*, x_0 ∈ D(A*) and A*x_0 = y_0. This is what we wished to prove.

THEOREM 2. If D(A) and D(B) are dense in H and A ⊆ B, then B* ⊆ A*.

The lineal D(B*) is formed by the elements y for which, given any x ∈ D(B), we have (Bx, y) = (x, y*), where y* = B*y. But, since A ⊆ B, it follows from (Bx, y) = (x, y*) for x ∈ D(B) that (Ax, y) = (x, y*) for x ∈ D(A), i.e. if y ∈ D(B*), then y ∈ D(A*) and B*y = A*y = y*, and this means in fact that B* ⊆ A*.

THEOREM 3. If D(A) is dense in H and A admits of closure, then (Ā)* = A*.

We have Ā ⊇ A, so that (Ā)* ⊆ A*, and it remains to show that every element y of D(A*) belongs also to D((Ā)*). By hypothesis, (Ax, y) = (x, A*y) for x ∈ D(A), and it is enough to show that (Āx, y) = (x, A*y) for x ∈ D(Ā). If x ∈ D(Ā), there exists a sequence x_n of D(A) such that x_n ⇒ x and Ax_n ⇒ Āx. By hypothesis, (Ax_n, y) = (x_n, A*y) and in the limit (Āx, y) = (x, A*y). This is what we set out to prove.

THEOREM 4. If A* and (A*)* = A** exist, then A ⊆ A**.

The lineal D(A**) of elements z is defined by the equation (A*y, z) = (y, z**) for y ∈ D(A*), where z** = A**z. But we have from the definition of A*: (A*y, z) = (y, Az), where y ∈ D(A*) and z ∈ D(A), whence it follows that A ⊆ A**.

Since, by Theorem 1, A** is a closed operator, it follows from A ⊆ A** that A admits of closed extensions, i.e. the existence of A** is a sufficient condition for A to admit of closed extensions. We shall see below that this condition is also necessary. Remember that the existence of A** is equivalent to the fact that D(A*) is dense in H.


THEOREM 5. If D(A) and R(A) are dense in H and the inverse A^{-1} exists, there exist operators A*, (A^{-1})*, (A*)^{-1}, and

(A*)^{-1} = (A^{-1})*. (3)

The existence of A* and (A^{-1})* follows directly from the fact that D(A) and R(A) are dense in H. Let x ∈ D(A*) and y ∈ D(A^{-1}) = R(A). We have

(x, y) = (x, AA^{-1}y) = (A*x, A^{-1}y),

whence it follows that A*x ∈ D((A^{-1})*) and

(A^{-1})* A*x = x (x ∈ D(A*)). (4)

This shows that the equation A*x = 0 has only a zero solution (in D(A*)), i.e. the operator (A*)^{-1} exists, and in addition, it follows from (4) that

(A*)^{-1} ⊆ (A^{-1})*. (5)

Now let x ∈ D(A) and y ∈ D((A^{-1})*). We have

(x, y) = (A^{-1}Ax, y) = (Ax, (A^{-1})*y),

whence (A^{-1})*y ∈ D(A*) and

A*(A^{-1})*y = y (y ∈ D((A^{-1})*)).

But it follows from this equation that

(A^{-1})* ⊆ (A*)^{-1},

which, in conjunction with (5), yields (3). The theorem is proved.

The solubility of the equation

Ax = y (6)

is bound up with the concept of conjugate operator. 

A closed operator A with domain D(A), dense in H, is said to be normally soluble if the necessary and sufficient condition for (6) to be soluble (not necessarily uniquely) is that y be orthogonal to the subspace of solutions of

A*z = 0. (7)

THEOREM 6. The necessary and sufficient condition for the normal solubility of a closed operator A with a domain D(A), dense in H, is that R(A) be a subspace.

The operator A* is closed, and the set of solutions of (7) is some subspace l. It may readily be seen that all the elements of l are orthogonal to R(A). For, if y ∈ R(A), then y = Ax and (y, z) = (Ax, z) = (x, A*z) = (x, 0) = 0. Thus, in view of the continuity of the scalar product, l is orthogonal to the subspace $\overline{R(A)}$. Let us now show that, if an element w is orthogonal to R(A), it belongs to l. In fact, (Ax, w) = 0 implies (Ax, w) = (x, 0) = (x, A*w), i.e. A*w = 0 and w ∈ l. It follows from what has been said that the whole of H is the direct sum of two orthogonal subspaces

H = $\overline{R(A)}$ ⊕ l,

and the necessary and sufficient condition for A to be normally soluble is that R(A) coincide with $\overline{R(A)}$, i.e. that R(A) be a subspace.
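In a finite-dimensional space every lineal is a subspace, so that every operator is normally soluble, and the decomposition of H into R(A) and the solutions of (7) can be seen on a small real matrix. In the sketch below the matrix, the vectors and the helper names are assumptions chosen for the illustration; for a real matrix the conjugate operator is given by the transposed matrix.

```python
def matvec(m, v):
    """Product of a matrix (list of rows) and a vector."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

def transpose(m):
    return [[m[k][i] for k in range(len(m))] for i in range(len(m[0]))]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

A = [[1.0, 2.0],
     [3.0, 6.0]]                 # rank one: R(A) is spanned by (1, 3)
z = [3.0, -1.0]                  # spans the solutions of A* z = 0

range_direction = [1.0, 3.0]

def residual(y):
    """Component of y orthogonal to R(A); it vanishes exactly when Ax = y is soluble."""
    coeff = dot(y, range_direction) / dot(range_direction, range_direction)
    return [yi - coeff * ri for yi, ri in zip(y, range_direction)]

y_soluble = matvec(A, [2.0, 0.0])     # equals (2, 6) and is orthogonal to z
y_insoluble = [1.0, 0.0]              # (y, z) = 3, so Ax = y has no solution
```

The residual of the insoluble right-hand side is a multiple of z, in agreement with the decomposition of H into R(A) and l.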





186. The graph of an operator. We can discuss, in addition to space H, the space ℋ whose elements are pairs {x, y} of elements x and y of H, multiplication by a number and addition being defined in ℋ by

a{x, y} = {ax, ay}; {x_1, y_1} + {x_2, y_2} = {x_1 + x_2, y_1 + y_2} (8)

and the scalar product by

({x_1, y_1}, {x_2, y_2}) = (x_1, x_2) + (y_1, y_2). (9)

All the axioms are easily seen to hold. If A is an operator in H, the set F(A) of elements {x, Ax} of space ℋ for x ∈ D(A) is called the graph of operator A. All the elements of this set are uniquely defined by their abscissae (by the first element of a pair). Conversely, if all the elements of some set F of elements of ℋ are uniquely defined by their abscissae, there exists in H an operator (not necessarily distributive) whose graph is the set F. The fact that A is closed is easily seen to be equivalent to the set F(A) being closed in ℋ. If A is distributive, and defined on a lineal, F(A) is a lineal in ℋ. As above, we shall in future only discuss distributive operators, defined on lineals.

Let us define an operator U in the whole of ℋ by the equation

U{x, y} = {iy, −ix}. (10)

It is easily seen that U is a unitary operator and that U^{-1} = U.


Let A be an operator in H. We form the scalar product of an element of the set UF(A) with an element {x, y}:

({iAz, −iz}, {x, y}) = i[(Az, x) − (z, y)] (z ∈ D(A)). (11)

Let A be an operator closed in H and D(A) be dense in H. Let us prove that ℋ can be expanded into two orthogonal subspaces in accordance with

ℋ = UF(A) ⊕ F(A*). (12)

If the element {x, y} is orthogonal to UF(A), it follows from (11) that (Az, x) = (z, y) for z ∈ D(A), i.e. x ∈ D(A*) and y = A*x, or, alternatively, {x, y} ∈ F(A*). Conversely, it also follows from (11) that, if {x, y} ∈ F(A*), the element {x, y} is orthogonal to UF(A). It only remains to observe, in order to prove (12), that, since A and A* are closed, UF(A) and F(A*) are subspaces of space ℋ.
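The properties of U and the orthogonality expressed by (11) and (12) can be verified directly when H is a finite-dimensional complex space and A is given by a matrix. In the sketch below the matrix, the vectors and all helper names are assumptions chosen for the illustration; the conjugate operator is then given by the conjugate transposed matrix.

```python
def inner(u, v):
    """Scalar product (u, v) = sum of u_k * conj(v_k)."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def pair_inner(p, q):
    """Scalar product (9) of two pairs {x1, y1} and {x2, y2}."""
    return inner(p[0], q[0]) + inner(p[1], q[1])

def apply_u(p):
    """The operator (10): U{x, y} = {iy, -ix}."""
    x, y = p
    return ([1j * c for c in y], [-1j * c for c in x])

def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

def adjoint(m):
    return [[m[k][i].conjugate() for k in range(len(m))] for i in range(len(m[0]))]

def graph_point(m, v):
    """The element {v, Mv} of the graph of the operator given by the matrix m."""
    return (v, matvec(m, v))

A = [[1 + 2j, 3 + 0j],
     [0.5j, -1 + 0j]]
A_STAR = adjoint(A)

z = [1 - 1j, 2 + 0.5j]
x = [0.3j, -2 + 1j]

# An element of U F(A) and an element of F(A*); by (11) their scalar product
# equals i[(Az, x) - (z, A*x)], which is zero.
cross_product = pair_inner(apply_u(graph_point(A, z)), graph_point(A_STAR, x))

p = graph_point(A, z)
twice = apply_u(apply_u(p))      # U applied twice gives back the original pair
```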

If A is not closed, but D(A) is dense in H, as above, we have instead 
of (12): 

H =UF(A)@ F (4*). (13) 


Further, the difference $\tilde H \ominus UF(A)$ is the set of elements $\{x, y\}$
orthogonal to $UF(A)$, or what amounts to the same thing, to $\overline{UF(A)}$,
i.e. by (11), this set consists of the pairs $\{x, y\}$ satisfying the condition
$(Az, x) = (z, y)$ for $z \in D(A)$, so that the existence of the operator $A^*$
is equivalent to the fact that the elements of this set are uniquely
defined by the abscissa $x$.

In view of the above, we have the following lemma. 

LEMMA. The necessary and sufficient condition for the existence of
operator $A^*$ is that the elements of the set

$$\tilde H \ominus UF(A)$$

be uniquely defined by their abscissae.
Let us now prove a theorem. 


THEOREM 1. If an operator $A$ is defined on a dense set and admits of
closure, there exist $A^*$, $A^{**}$ and

$$A^{**} = \bar A. \tag{14}$$


Suppose first that $A$ is closed. It follows from (12) and $U^{-1} = U$
that

$$\tilde H = F(A) \oplus UF(A^*),$$

i.e. the elements of the set $F(A) = \tilde H \ominus UF(A^*)$ are uniquely defined
by their abscissae, and, by the lemma, this set defines the graph of
the operator conjugate to $A^*$, i.e. of $A^{**}$. But this set is $F(A)$, whence
$A^{**} = A$.

Suppose that $A$ is not closed, but admits of closure. By what we
have proved, $(\bar A)^{**} = \bar A$. But, on the other hand [185]:
$(\bar A)^{**} = ((\bar A)^*)^* = (A^*)^* = A^{**}$, whence (14) follows.

COROLLARY. If $A^*$ and $A^{**}$ exist, $A$ admits of closure [185] and it
follows from (14) that

$$(A^{**})^* = A^{***} = A^*. \tag{15}$$




THEOREM 2. If A is a closed operator and D(A) = H, A is bounded. 

It follows from the hypotheses and Theorem 1 that $D(A^*)$ is dense in
$H$ and $A = A^{**}$. Let us first show that there exists a positive number
$N$ such that $\|A^*x\| \le N\|x\|$ $(x \in D(A^*))$.

To do this, we consider the scalar product

$$(Ay, x) = (y, A^*x).$$

It defines in $H$ a linear (bounded) functional $l_x(y)$ for fixed $x$ of
$D(A^*)$. If $x_n \in D(A^*)$ and $x_n \Rightarrow 0$, the sequence of functionals $l_{x_n}(y)$
tends to zero on every element $y$ of $H$, so that a positive number $N_1$
exists, such that [100]

$$|l_{x_n}(y)| = |(y, A^*x_n)| \le N_1 \|y\|. \tag{16}$$

If the operator $A^*$ did not have a bounded norm, there would exist
a sequence $x_n \in D(A^*)$, for which $x_n \Rightarrow 0$, whilst $\|A^*x_n\| \to \infty$.
But this contradicts (16), since, on putting $y = A^*x_n$ in (16), we get
$\|A^*x_n\|^2 \le N_1\|A^*x_n\|$, i.e. $\|A^*x_n\| \le N_1$. Thus $\|A^*\| \le N$ on
$D(A^*)$. But $A^*$ can now be extended to the whole of $H$, and, since $A^*$
is a closed operator, $D(A^*) = H$. The operator $A^{**} = A$ is conjugate
to the bounded operator $A^*$, i.e. is itself bounded, and the theorem is
proved.

Notice also that, if $\lambda$ is a number and $A^*$ exists, $(A - \lambda E)^* =
A^* - \bar\lambda E$ also exists. If $B$ is a bounded linear operator, given on
the whole of $H$, whilst $A$ has $A^*$, then $(A + B)^*$ exists and is equal
to $A^* + B^*$.


187. Symmetric and self-conjugate operators. We shall be mainly
concerned below with so-called symmetric and self-conjugate operators.
DEFINITION. An operator $A$ is said to be symmetric if $D(A)$ is dense
in $H$ and
$$(Ax, y) = (x, Ay) \tag{17}$$
for any $x$ and $y$ of $D(A)$.
It follows from (17) that any $y$ of $D(A)$ belongs to $D(A^*)$ also, and
$A^*y = Ay$ for such $y$, i.e.
$$A \subset A^*. \tag{18}$$


A symmetric operator is said to be semi-bounded from below if
there exists a finite

$$m_A = \inf\,(Ax, x) \quad \text{for } x \in D(A) \text{ and } \|x\| = 1,$$

whence it follows that

$$(Ax, x) \ge m_A\,(x, x) \quad (x \in D(A)), \tag{19}$$

and the number $m_A$ cannot be replaced by a greater one.

If $m_A > 0$, the operator $A$ is said to be positive definite, whilst it is
called positive if $m_A \ge 0$ [cf. 126].

Notice also that, if the lineal $D(A)$ is dense in $H$ and $(Ax, x)$ is real
for all $x \in D(A)$, then $A$ is symmetric, i.e. $(Ax, y) = (x, Ay)$ for $x$ and
$y \in D(A)$. This is proved just like Theorem 2 of [124].

DEFINITION. A symmetric operator $A$ is said to be self-conjugate if
$A^* = A$.

It follows from what has been said that, to prove that a symmetric
operator is self-conjugate, we only need to prove the following: if an
element $x \in D(A^*)$, then $x \in D(A)$. By (18), a symmetric operator $A$
admits of closure, and we have

$$\bar A = A^{**} \subset A^*; \qquad A^{***} = (\bar A)^* = A^*. \tag{20}$$


A self-conjugate operator is obviously closed. 

Notice also that, if $\lambda$ is a real number, the operator $A - \lambda E$ will be
symmetric if $A$ is symmetric, and self-conjugate if $A$ is self-conjugate.

THEOREM 1. If a self-conjugate operator $A$ has an inverse $A^{-1}$, the
range of values $R(A)$ of $A$ is dense in $H$ and $A^{-1}$ is a self-conjugate
operator on $R(A)$.

If the lineal $R(A)$ were not dense in $H$, there would exist an element
$z$, different from zero, orthogonal to $R(A)$, i.e.

$$(Ax, z) = 0 \quad (x \in D(A)),$$

or, what amounts to the same thing, $(Ax, z) = (x, 0)$. But this implies
that $z \in D(A^*)$ and $A^*z = Az = 0$ for a non-zero element, which
contradicts the existence of the inverse $A^{-1}$. We have therefore shown
that the lineal $R(A)$ is dense in $H$.

It follows from Theorem 5 [185] that $(A^{-1})^* = (A^*)^{-1}$, or, since $A$ is
self-conjugate: $(A^{-1})^* = A^{-1}$, i.e. $A^{-1}$ is in fact self-conjugate.

THEOREM 2. If there exists for a symmetric operator $A$ a number $\lambda$
such that both elements of the form $(A - \lambda E)x$ and elements of the form
$(A - \bar\lambda E)x$ $(x \in D(A))$ fill the whole of $H$, then $A$ is self-conjugate on $D(A)$.

We have to show that, if $y \in D(A^*)$, then $y \in D(A)$. We have
$(Ax, y) = (x, y^*)$ for $x \in D(A)$, whence

$$((A - \lambda E)x, y) = (x, y^* - \bar\lambda y).$$

By hypothesis, there exists at least one element $z \in D(A)$ such that
$y^* - \bar\lambda y = (A - \bar\lambda E)z$, and we can write, in view of the symmetry
of $A$:

$$((A - \lambda E)x, y) = (x, (A - \bar\lambda E)z) = ((A - \lambda E)x, z).$$


But elements of the form $(A - \lambda E)x$ exhaust the whole of $H$, and
it follows from the last equation that $y = z \in D(A)$. This is what we
needed to prove.

COROLLARY. If R(A) = H for a symmetric operator A, then A is self- 
conjugate. 

The proof only requires the application of Theorem 2 with $\lambda = 0$.

We shall prove in [189] that a proposition holds which is in a 
certain sense the converse of Theorem 2. 

In fact, if $A$ is a self-conjugate operator and $\lambda$ is a non-real number,
the operator $A - \lambda E$ has a bounded inverse, defined in the whole of $H$.

THEOREM 3. If a symmetric operator $A$ has an inverse $A^{-1}$, bounded
on $R(A)$, then $R(A^*) = H$.

In view of the fact that the closure of a symmetric operator leads
to a symmetric operator, and that $(\bar A)^* = A^*$, we can assume for the
proof that $A$ is closed. On using Theorem 2 of [184], we can say that
$R(A)$ is a subspace. We have to show that, given any fixed $y^* \in H$ and
any $x \in D(A)$, the scalar product $(x, y^*)$ can be written as $(Ax, y)$.
On writing $Ax = z$, we get $(x, y^*) = (A^{-1}z, y^*)$, and, since $A^{-1}$ is
bounded on $R(A)$, the expression $(A^{-1}z, y^*)$ can be regarded as a linear
(bounded) functional $l_{y^*}(z)$ on the subspace $R(A)$. It can be written as
$(z, y)$ [123], where $y \in R(A)$, so that we get $(x, y^*) = (z, y) = (Ax, y)$.

This equation implies that any $y^*$ of $H$ is expressible in the form
$A^*y$. This is what we set out to prove.

COROLLARY. If $A$ is a self-conjugate, positive definite operator, $A^{-1}$
exists, and is bounded and defined on the whole of $H$.

Since $A$ is positive definite, i.e.

$$(Ax, x) \ge a\,(x, x) \quad (a > 0),$$

we have $a\|x\| \le \|Ax\|$, whence it follows that $A^{-1}$ exists on $R(A)$,
and its norm $\|A^{-1}\|$ does not exceed $a^{-1}$. We can assert, on the basis
of the previous theorem, that $R(A^*) = H$, but $R(A^*) = R(A)$, so that
$A^{-1}$ is defined on the whole of $H$.
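
The step from positive definiteness to the lower bound on the norm of $Ax$ is not written out in the text. A minimal sketch, assuming only Buniakowski's inequality for the scalar product, is:

```latex
\[
a\,\|x\|^2 \;\le\; (Ax, x) \;\le\; \|Ax\|\,\|x\|
\quad\Longrightarrow\quad
a\,\|x\| \;\le\; \|Ax\| \qquad (x \in D(A)).
\]
```

In particular $Ax = 0$ implies $x = 0$, which is what guarantees the existence of the inverse, and writing $x = A^{-1}z$ gives the bound $a^{-1}$ for its norm.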

Notice, finally, that, given any symmetric extension $\tilde A$ of a
symmetric operator $A$, we have $\tilde A \subset A^*$; also, a self-conjugate operator
does not admit of symmetric extensions.

Both these propositions follow directly from the definitions of
symmetric and self-conjugate operators.

THEOREM 4. If $A$ is a closed linear operator with a domain of definition
dense in $H$, the product $A^*A$ is a self-conjugate positive operator.

The fact that $A^*A$ is positive follows from

$$(A^*Ax, x) = (Ax, Ax) \ge 0 \quad (x \in D(A^*A)).$$

The symmetry of $A^*A$ on $D(A^*A)$ is clear from the equations

$$(A^*Ax, y) = (Ax, Ay) = (x, A^*Ay).$$

Let us show that the equation

$$(A^*A + E)x = y \tag{21}$$

is (uniquely) soluble for any $y$ of $H$. We take the space $\tilde H$ introduced
in [186], and its decomposition into

$$\tilde H = F(A) \oplus UF(A^*).$$

It follows from this that the element $\{y, 0\}$ is uniquely expressible as

$$\{y, 0\} = \{x, Ax\} + i\{A^*z, -z\}.$$

Consequently, $y = x + iA^*z$, $z = -iAx$, so that

$$y = x + A^*Ax \quad (x \in D(A^*A)),$$

i.e. equation (21) has a solution for any $y \in H$. Let us now show that
$D(A^*A)$ is dense in $H$. If we assume the converse, a non-zero element $z$
must exist, orthogonal to $D(A^*A)$. By what has been said, it can be
written as $z = (A^*A + E)x_0$, where $x_0 \in D(A^*A)$, and for any
$x \in D(A^*A)$:

$$0 = (z, x) = ((A^*A + E)x_0, x) = (x_0, (A^*A + E)x),$$

and we obtain, on putting $x = x_0$,

$$\|x_0\|^2 + (x_0, A^*Ax_0) = \|x_0\|^2 + \|Ax_0\|^2 = 0,$$

i.e. $x_0 = 0$, which contradicts the foregoing. Hence $D(A^*A)$ is dense in
$H$, i.e. $A^*A$ and $E + A^*A$ are symmetric operators. In view of the fact
that $R(A^*A + E) = H$, the operator $A^*A + E$ is self-conjugate
(Theorem 2). This means that $A^*A$ is also self-conjugate.


188. Examples of unbounded operators. The present section will be
concerned with various differential operators, from the point of view of the general
theory of operators. All the operators will be unbounded. The discussion will
take place in the complex Hilbert space $L_2$ of complex-valued functions of a
real variable. The formula for integration by parts, which will play a
fundamental role in what follows, has the same form here as in real space, viz. we have
for any two functions $\varphi(x)$ and $\psi(x)$ of $W_2^{(1)}(D)$ in the case of a piecewise smooth
surface $S$ of a domain $D$ [113]:

$$\int_D \frac{\partial\varphi(x)}{\partial x_k}\,\overline{\psi(x)}\,dx = -\int_D \varphi(x)\,\frac{\partial\overline{\psi(x)}}{\partial x_k}\,dx + \int_S \varphi(x)\,\overline{\psi(x)}\cos(n, x_k)\,dS, \tag{22}$$

where $n$ is the outward normal to $S$.

We shall start with the simple differential operator $D = i\,d/dx$.

1. The operator $D = i\,d/dx$ in space $H = L_2[0, 1]$.

We have seen that, in the abstract theory, an operator A is specified by 
the domain of definition D(A) and by the rule for evaluating A on elements of 
D(A). 

Our present operator $D$ can be uniquely defined on all functions of $L_2[0, 1]$
having a generalized derivative in $L_2[0, 1]$. But the $D$ defined on so wide a class
of functions will not have a number of properties possessed by D when it is 
considered say on finite smooth functions. We shall therefore start, in this 
and the subsequent examples, from an initial discussion of the differential 
operator on a set of smooth functions subject to certain boundary conditions; 
we study its properties, such as symmetry, positive definiteness, inversion 
etc., then raise the question of extending it whilst retaining certain properties. 

The choice of the domain of definition of the original differential operator 
is not unique. In order to emphasize this, we shall make different choices in 
the examples below. 

Let $A$ denote the operator $D$, considered on the set $\mathring{C}^{(1)}[0, 1]$ of all finite
continuously differentiable functions on $[0, 1]$ (cf. the notation in [113]).
The value of $A$ on $\varphi$ of $D(A)$ is calculated from the formula

$$A\varphi = i\,\frac{d\varphi(x)}{dx}, \tag{23}$$

and $D(A)$ is dense in $L_2[0, 1]$.
The operator $A$ is symmetric, since it follows from (23) that, for $\varphi(x)$ and
$\psi(x) \in D(A)$:

$$(A\varphi, \psi) = \int_0^1 i\,\frac{d\varphi(x)}{dx}\,\overline{\psi(x)}\,dx = -\int_0^1 i\,\varphi(x)\,\frac{d\overline{\psi(x)}}{dx}\,dx = (\varphi, A\psi).$$





The symmetry of $A$ implies that $A$ admits of closure, has a conjugate and
$A \subset \bar A \subset A^*$. Let us find the functions that make up $D(\bar A)$ and $D(A^*)$. Let
$\varphi_m(x) \in D(A)$ and $\varphi_m(x) \Rightarrow \varphi(x)$, $A\varphi_m = \psi_m(x) \Rightarrow \psi(x)$; then $\varphi(x) \in D(\bar A)$
and $\psi(x) = \bar A\varphi(x)$.

It follows from the theory of generalized derivatives that this closure of
$A$ extends $D(A)$ to $D(\bar A) = \mathring{W}_2^{(1)}[0, 1]$ [113].

Each $\varphi(x)$ of $\mathring{W}_2^{(1)}[0, 1]$ is an absolutely continuous function, equal to zero
at the ends and having a generalized first derivative of $L_2[0, 1]$. It could be
shown that any such function belongs to $\mathring{W}_2^{(1)}[0, 1]$. The operator $\bar A$ on $\varphi(x)$
of $D(\bar A)$ is calculated from (23), the only difference being that $d/dx$ now denotes
generalized and not classical differentiation. Let us now find the functions
that make up $D(A^*)$. The function $\varphi(x) \in D(A^*)$, if there is a function
$\psi(x) \in L_2[0, 1]$ such that $(A\omega, \varphi) = (\omega, \psi)$, i.e.

$$\int_0^1 i\,\frac{d\omega(x)}{dx}\,\overline{\varphi(x)}\,dx = \int_0^1 \omega(x)\,\overline{\psi(x)}\,dx \tag{24}$$

for all $\omega(x)$ of $D(A)$. But this implies [109] that $\varphi(x)$ has a generalized derivative
$d\varphi(x)/dx$, equal to $-i\psi(x)$, i.e. that $\varphi(x) \in W_2^{(1)}[0, 1]$ and $i\,d\varphi(x)/dx = \psi(x)$,
where any $\varphi(x)$ of $W_2^{(1)}[0, 1]$ satisfies (24) with $\psi(x) = i\,d\varphi(x)/dx$. We have thus
shown that $D(A^*) = W_2^{(1)}[0, 1]$ and $A^*\varphi = i\,d\varphi(x)/dx$. It is clear that $W_2^{(1)}[0, 1]$
is wider than $\mathring{W}_2^{(1)}[0, 1]$. It is easily verified that $A^*$ is not symmetric on $D(A^*)$.

Let us show that a bounded inverse exists on $R(A)$ (whence it will follow
that $R(\bar A)$ is a subspace). Let $\varphi(x) \in D(A)$. Then $\varphi(x) = (1/i)\int_0^x A\varphi(t)\,dt$,
and, by Buniakowski's inequality,

$$\|\varphi\|^2 = \int_0^1 \Bigl|\int_0^x A\varphi(t)\,dt\Bigr|^2 dx \le \|A\varphi\|^2.$$

It follows from this that $A^{-1}$ exists on $R(A)$ and $\|A^{-1}\| \le 1$.
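
The use of Buniakowski's inequality here can be made explicit. The following is a sketch of the intermediate step, under the assumption that the inner integral is estimated pointwise in $x$ before integrating over $[0, 1]$:

```latex
\begin{align*}
\Bigl|\int_0^x A\varphi(t)\,dt\Bigr|^2
  &\le \int_0^x 1^2\,dt \cdot \int_0^x |A\varphi(t)|^2\,dt
   \;\le\; x\,\|A\varphi\|^2 , \\
\|\varphi\|^2
  &\le \|A\varphi\|^2 \int_0^1 x\,dx
   \;=\; \tfrac12\,\|A\varphi\|^2
   \;\le\; \|A\varphi\|^2 .
\end{align*}
```

The estimate obtained this way is in fact slightly stronger than the one needed, but the cruder bound is all that the argument uses.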

We must discuss the possibility of different self-conjugate extensions of the
operator $A$. We know that, given any symmetric extension $\tilde A$, we have
$\bar A \subset \tilde A \subseteq A^*$, i.e. given this extension, we must add to $D(\bar A)$ elements of
$D(A^*)$, and take $\tilde Az = A^*z$ on these added elements $z$. It should be borne in
mind that the symmetry of the operator is not lost during the extension.

Let us write $H$ as

$$H = R(\bar A) \oplus U.$$

By the theorem of [185], $U$ consists of the zeros $u(x)$ of the conjugate operator.
But $A^*u = i\,du(x)/dx$, so that $u(x) = \text{const}$. We try to extend $\bar A$ in such a way
that $R(\bar A)$ is extended to $H$ and the operator remains symmetric. This is done
by taking all the solutions of the equations $A^*\varphi_0 = \text{const.}$ and choosing those
among them on which the operator $D$ is symmetric.

Obviously, $\varphi_0(x)$ has the form $\varphi_0(x) = C_1(x + C)$. We choose constants $C$ and
$C_1$ so that $D$ is symmetric on $\varphi_0(x)$, i.e. so that

$$0 = (D\varphi_0, \varphi_0) - (\varphi_0, D\varphi_0) = i\,\varphi_0(x)\,\overline{\varphi_0(x)}\,\Big|_{x=0}^{x=1} = i\,|C_1|^2\,[(1 + C)(1 + \bar C) - C\bar C] = i\,|C_1|^2\,[1 + C + \bar C].$$

Hence we have arbitrary $C_1$, and $C = -1/2 + \beta i$, where $\beta$ is an arbitrary
real number. We associate with $D(\bar A)$ the elements $\varphi_0(x) = C_1(x - 1/2 + \beta i)$,
where $\beta$ is a fixed real number, and $C_1$ is an arbitrary complex number. Let
$D(\tilde A)$ denote the set obtained, and $\tilde A$ the operator $D$ on it. It is easily seen that
$\tilde A$ is a symmetric extension of $A$.

On the other hand, the domain of values of $\tilde A$ is, by virtue of its
construction, the whole of $H$, so that $\tilde A$ is a self-conjugate extension of $A$ [187]. The
added elements of this extension are easily seen to satisfy the boundary
condition

$$\varphi(1) = \frac{\tfrac12 + \beta i}{-\tfrac12 + \beta i}\,\varphi(0), \quad\text{i.e.}\quad \varphi(1) = e^{i\theta}\varphi(0) \quad (0 < \theta < 2\pi). \tag{25}$$
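
That the factor in the boundary condition has modulus one, and how the angle $\theta$ depends on the real parameter $\beta$, can be seen from the short computation below. It is offered only as a check of (25); the formulas for $\cos\theta$ and $\sin\theta$ are derived here and do not appear in the text.

```latex
\begin{align*}
\Bigl|\tfrac12 + \beta i\Bigr|^2 &= \tfrac14 + \beta^2 = \Bigl|-\tfrac12 + \beta i\Bigr|^2 , \\
e^{i\theta} = \frac{\tfrac12 + \beta i}{-\tfrac12 + \beta i}
  &= \frac{(\tfrac12 + \beta i)(-\tfrac12 - \beta i)}{\tfrac14 + \beta^2}
   = \frac{\beta^2 - \tfrac14 - \beta i}{\beta^2 + \tfrac14} , \\
\cos\theta &= \frac{\beta^2 - \tfrac14}{\beta^2 + \tfrac14} , \qquad
\sin\theta = \frac{-\beta}{\beta^2 + \tfrac14} .
\end{align*}
```

For $\beta = 0$ this gives $e^{i\theta} = -1$, i.e. $\theta = \pi$, while $e^{i\theta} \to 1$ as $\beta \to \pm\infty$; the value $e^{i\theta} = 1$ itself is not attained for any finite real $\beta$.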


Obviously, the elements of $D(\bar A)$ also satisfy this condition. On the other
hand, we have for any element $\varphi(x)$ of $D(A^*)$, satisfying condition (25):

$$\int_0^1 i\,\frac{d\varphi(x)}{dx}\,\overline{\omega(x)}\,dx = \int_0^1 \varphi(x)\,\overline{i\,\frac{d\omega(x)}{dx}}\,dx$$

for arbitrary $\omega(x)$ of $D(\tilde A)$. Therefore such a $\varphi(x) \in D(\tilde A^*)$ and $\tilde A^*\varphi = i\,d\varphi(x)/dx$.
But $\tilde A$ is a self-conjugate extension of $A$, i.e. $\tilde A^* = \tilde A$, and $D(\tilde A)$ therefore
consists of all the elements of $D(A^*)$ that satisfy (25). It follows from what
has been said that $\mathring{W}_2^{(1)}[0, 1] = D(\bar A)$ consists of all absolutely continuous
functions $\varphi(x)$ that vanish at the ends of the interval and have $d\varphi(x)/dx$ from $L_2[0, 1]$.

We have formed all the possible self-conjugate extensions $\tilde A$ of $A$ for which
$R(\tilde A) = H$. Each of these extensions is defined by an arbitrary real
parameter $\beta$, or what amounts to the same thing, by a real number $\theta$ varying
between the limits $0 < \theta < 2\pi$.

Let us also consider the possibility of self-conjugate extensions $A'$ of $A$ such
that $R(A') = R(\bar A)$.

If such an extension exists, the set $D(\bar A)$ is supplemented for it merely by zeros of the
operator $A^*$, i.e. by elements $\varphi(x) = \text{const}$. The operator $D$ on the set
$D(\bar A) + \text{const.}$ is a symmetric operator $A'$, which is an extension of $A$. The set $D(A')$
can be characterized by the fact that it consists of all the elements
$\varphi(x) \in D(A^*)$ for which $\varphi(0) = \varphi(1)$.

Let us show that $D(A'^*) = D(A')$. Let $\varphi(x) \in D(A'^*)$, i.e. given any
$\omega(x) \in D(A')$, let

$$0 = (A'\omega, \varphi) - (\omega, \psi). \tag{26}$$

But $D(A'^*) \subset D(A^*)$, so that

$$(A'\omega, \varphi) = \int_0^1 i\,\frac{d\omega(x)}{dx}\,\overline{\varphi(x)}\,dx = \int_0^1 \omega(x)\,\overline{i\,\frac{d\varphi(x)}{dx}}\,dx + i\,\omega(x)\,\overline{\varphi(x)}\,\Big|_{x=0}^{x=1},$$

whence, in view of (26) and the fact that $\omega(x) \in D(A')$, we have $\psi(x) = i\,d\varphi(x)/dx$
and $\varphi(0) = \varphi(1)$, i.e. $\varphi(x) \in D(A')$. We have thus shown that $A'$ is a
self-conjugate extension of $A$. If we compare the boundary condition $\varphi(0) = \varphi(1)$,
to which the functions of $D(A')$ are subject, with condition (25) for our earlier
extensions, it will be seen that the former corresponds to $\theta = 0$ (or what amounts
to the same thing, $\beta = \infty$).

We have thus exhausted all possible self-conjugate extensions of the opera- 
tor A. In addition to these, A has various non-self-conjugate extensions, but we 
shall not be concerned with them. 

The self-conjugate extensions $\tilde A$, corresponding to $\theta \ne 0$, have bounded
inverses $\tilde A^{-1}$. For, if $\tilde A\psi = i\,d\psi(x)/dx = 0$, then $\psi(x) = C$ and, by (25), we must
have $C = e^{i\theta}C$, i.e. $C = 0$. Hence it follows that $\tilde A^{-1}$ exists, and, since it is
defined on the whole of $H$ and is a self-conjugate operator, it is in fact bounded
[186].

The operator $A'$ has no inverse on $R(A') = R(\bar A)$.
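
The contrast between the extensions with $\theta \ne 0$ and the extension $A'$ can be illustrated by computing eigenvalues directly. This is an illustrative sketch rather than part of the argument above; the symbols $\mu$ and $k$ are introduced here only for the purpose of the example.

```latex
\begin{align*}
i\,\frac{d\psi}{dx} = \mu\,\psi
  &\;\Longrightarrow\; \psi(x) = C e^{-i\mu x} , \\
\psi(1) = e^{i\theta}\psi(0)
  &\;\Longrightarrow\; e^{-i\mu} = e^{i\theta}
   \;\Longrightarrow\; \mu = \mu_k = 2\pi k - \theta
   \qquad (k = 0, \pm 1, \pm 2, \ldots).
\end{align*}
```

For $0 < \theta < 2\pi$ none of the numbers $\mu_k$ is zero, in agreement with the existence of the inverse of the corresponding extension. For $\theta = 0$ the value $\mu_0 = 0$ is an eigenvalue with eigenfunction $\psi(x) = \text{const}$, which is why $A'$ has no inverse.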

2. The operator $D = i\,d/dx$ in space $H = L_2(-\infty, +\infty)$.

Let $A$ denote the differential operator $D$, defined on continuously
differentiable finite functions $\varphi(x)$. We can easily see that it is symmetric and that $D(A)$
is dense in $H$.

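Let us consider the conjugate operator $A^*$. The function $\psi(x) \in D(A^*)$, if,
given any $\varphi(x) \in D(A)$, we have

$$\int_{-\infty}^{+\infty} i\,\frac{d\varphi(x)}{dx}\,\overline{\psi(x)}\,dx = \int_{-\infty}^{+\infty} \varphi(x)\,\overline{\psi^*(x)}\,dx, \tag{27}$$

where $\psi^*(x) \in L_2(-\infty, +\infty)$, and $\psi^*(x) = A^*\psi(x)$. But it follows at once from
the first definition of generalized derivative that $D(A^*)$ is the set $W_2^{(1)}(-\infty, +\infty)$,
i.e. the set of functions of $L_2(-\infty, +\infty)$, absolutely continuous on every finite
interval, having a derivative in $L_2(-\infty, +\infty)$, and $\psi^*(x) = D\psi(x)$. Let us show
that, if $\varphi(x) \in W_2^{(1)}(-\infty, +\infty)$, then $\varphi(x) \to 0$ as $x \to \pm\infty$. It follows from the
obvious equation

$$\int_a^x \frac{d\varphi(t)}{dt}\,\overline{\varphi(t)}\,dt = |\varphi(x)|^2 - |\varphi(a)|^2 - \int_a^x \varphi(t)\,\frac{d\overline{\varphi(t)}}{dt}\,dt$$

and the fact that $\varphi(x)$ and $d\varphi(x)/dx \in L_2(-\infty, +\infty)$, that $|\varphi(x)|$ has a finite
limit as $x \to \pm\infty$ and that this limit must be zero.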

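Let us now investigate $A^{**}$. The function $\psi(x) \in D(A^{**})$, if (27) is satisfied
for every $\varphi(x) \in D(A^*)$, where $\psi^*(x) \in L_2(-\infty, +\infty)$ and $\psi^*(x) = A^{**}\psi(x)$.

On recalling that $A \subset A^*$ and $A^{**} \subseteq A^*$, we can assert that every function
$\psi(x)$ of $D(A^{**})$ must belong to $D(A^*)$, i.e. to $W_2^{(1)}(-\infty, +\infty)$, and $A^{**}\psi(x) =
D\psi(x)$. On the other hand, (27) is easily proved, by assuming that
$\varphi(x) \in D(A^*)$, $\psi(x) \in W_2^{(1)}(-\infty, +\infty)$ and $\psi^*(x) = i\,d\psi(x)/dx$. For, integration by
parts over any finite interval gives

$$\int_a^b i\,\frac{d\varphi(x)}{dx}\,\overline{\psi(x)}\,dx = \int_a^b \varphi(x)\,\overline{i\,\frac{d\psi(x)}{dx}}\,dx + i\,\varphi(x)\,\overline{\psi(x)}\,\Big|_{x=a}^{x=b}.$$

On observing that $\varphi(x)$ and $\psi(x) \to 0$ as $x \to \pm\infty$, we arrive at (27) in the
limit as $a \to -\infty$ and $b \to +\infty$. It follows from what has been said that
$A^{**} = A^*$, i.e. $A^*$ is a self-conjugate operator. But we know that $A^{**} = \bar A$,
so that the closure of $A$ leads to the self-conjugate operator $A^*$.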

3. The operator $D = i\,d/dx$ in space $H = L_2(0, +\infty)$.

Let $A$ be the operator $D$ on all continuously differentiable functions, finite
at infinity and in the neighbourhood of $x = 0$. It can be shown with the aid
of arguments precisely similar to the above that $A^*$ is the operator $D$, $D(A^*)$
is the set of functions $\varphi(x)$ of $H$, absolutely continuous in any finite interval
$[0, a]$ with a derivative in $L_2(0, +\infty)$, and $D(\bar A) = D(A^{**})$ is the set of
functions $\varphi(x)$ of $D(A^*)$ that satisfy $\varphi(0) = 0$, where $\bar A\varphi(x) = i\,d\varphi(x)/dx$. Hence
$D(A^*)$ is wider than $D(\bar A)$, and $\bar A$ is not a self-conjugate operator, since
$(\bar A)^* = A^*$. Let us show that $A$ has no self-conjugate extensions. Let $\tilde A$ be such
an extension. We have: $\tilde A\varphi = i\,d\varphi(x)/dx$, since $\tilde A \subseteq A^*$, and $D(\tilde A)$ must be
wider than $D(\bar A)$. But we shall prove shortly that, if $\varphi(x) \in D(\tilde A)$, then
$\varphi(x) \in D(\bar A)$, and this contradiction shows that $A$ has no self-conjugate extensions.

Let $\varphi(x) \in D(\tilde A)$, and hence $\varphi(x) \in D(A^*)$. The formula $(\tilde A\varphi, \varphi) = (\varphi, \tilde A\varphi)$ leads
us to the equation

$$\int_0^{+\infty} i\,\frac{d\varphi(x)}{dx}\,\overline{\varphi(x)}\,dx = \int_0^{+\infty} \varphi(x)\,\overline{i\,\frac{d\varphi(x)}{dx}}\,dx,$$

from which it follows, since $\varphi(x) \to 0$ as $x \to +\infty$, that $\varphi(0) = 0$, i.e.
$\varphi(x) \in D(\bar A)$.

4. The operator $-d^2/dx^2 + q(x)$ in space $L_2[0, 1]$.

Let $q(x)$ be a real function, continuous in the interval $[0, 1]$, and $A$ the
operator $[-d^2/dx^2 + q(x)]$ on the set $D(A)$ of all functions $\varphi(x)$ with the following
properties: $\varphi(x)$ and $d\varphi(x)/dx$ are absolutely continuous for $x \in [0, 1]$,
$\varphi(0) = \varphi(1) = 0$ and $d^2\varphi(x)/dx^2 \in L_2(0, 1)$. It is easily shown that $A$ is a symmetric
operator and $D(A)$ is dense in $H$.

We shall assume that $q(x)$ is such that the equation $-y'' + q(x)y = 0$ has
no solutions vanishing at $x = 0$ and $x = 1$, apart from the trivial $y \equiv 0$. Let
us show that $R(A) = H$, whence it follows that $A$ is self-conjugate.

Let $f(x) \in L_2[0, 1]$. We have to show that a function $\varphi$ of $D(A)$ exists for
which $A\varphi(x) = f(x)$.

We introduce the function

$$\psi(x) = -\int_0^x \Bigl[\int_0^t f(\tau)\,d\tau\Bigr] dt + x\int_0^1 \Bigl[\int_0^t f(\tau)\,d\tau\Bigr] dt,$$

which obviously belongs to $D(A)$, where $-\psi''(x) = f(x)$, and let $w(x)$ be a
solution of the equation

$$-w''(x) + q(x)\,w(x) = -q(x)\,\psi(x),$$

satisfying the condition $w(0) = w(1) = 0$. Such a solution (with continuous
derivatives up to the second order) exists [IV; 173]. It is easily verified directly
that the function $\varphi(x) = \psi(x) + w(x)$ belongs to $D(A)$ and $A\varphi(x) = f(x)$.
This is what we wanted to prove. Hence $A$ is self-conjugate. The operator $A$
has a bounded inverse [IV; 173]:

$$A^{-1}f(x) = -\int_0^1 G(x, t)\,f(t)\,dt,$$

where $G(x, t)$ is the Green's function of the operator $A$ with the boundary
condition $w(0) = w(1) = 0$. The above condition, that the equation
$-y'' + q(x)y = 0$ have no non-trivial solutions vanishing at $x = 0$ and $x = 1$,
is fulfilled if, say, $q(x) \ge 0$.
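
As a concrete check of the auxiliary function introduced above, consider the simplest right-hand side. The choice $f(x) \equiv 1$ and $q(x) \equiv 0$ is made here purely for illustration and is not taken from the text.

```latex
\begin{align*}
\int_0^t f(\tau)\,d\tau &= t , \qquad
\int_0^x t\,dt = \frac{x^2}{2} , \qquad
\int_0^1 t\,dt = \frac12 , \\
\psi(x) &= -\frac{x^2}{2} + \frac{x}{2} = \frac{x(1 - x)}{2} , \qquad
\psi(0) = \psi(1) = 0 , \qquad -\psi''(x) = 1 = f(x) .
\end{align*}
```

With $q \equiv 0$ the equation for $w$ has zero right-hand side, so $w \equiv 0$ and the solution of $A\varphi = f$ is simply $\varphi(x) = x(1 - x)/2$.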

5. The operator $D^k = (i)^k\,\partial^k/\partial x_1 \ldots \partial x_k$ in space $H = L_2(D)$, where $D$
is a bounded domain in $R_n$.

Let $A$ be the operator $D^k$ on all $k$-times continuously differentiable functions
$\varphi(x)$, finite in $D$. We know that $D(A)$ is dense in $H$, and it is easily verified
with the aid of (123) of [109] that $A$ is symmetric. The domain of definition of
the conjugate operator $A^*$ will consist of all functions $\varphi(x) \in L_2(D)$, having
inside $D$ a generalized derivative of the form $D^k$, belonging to $L_2(D)$ (this follows
from the definition of generalized derivative). It is clear that $D(A^*)$ is wider
than $D(A)$. A bounded inverse exists on $R(A)$. For, let $\varphi(x) \in D(A)$. We continue
it by zero outside $D$ and include $D$ in the cube: $-a < x_i < a$. We can now
write $\varphi(x)$ as

$$\varphi(x) = \int_{-a}^{x_1} \ldots \int_{-a}^{x_k} (i)^{-k}\,A\varphi(y_1, \ldots, y_k, x_{k+1}, \ldots, x_n)\,dy_1 \ldots dy_k,$$

whence, on using Buniakowski's inequality, we easily obtain

$$\|\varphi\|_{L_2(D)} \le C\,\|A\varphi\|_{L_2(D)},$$

so that $A^{-1}$ exists on $R(A)$, $\|A^{-1}\| \le C$ and $R(\bar A)$ is a subspace of $H$. As will
be shown later in the theory of extensions of operators, there exists at least
one self-conjugate extension for such symmetric operators.


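6. The operator $-\Delta = -\sum_{k=1}^{n} \partial^2/\partial x_k^2$ in space $H = L_2(D)$, where $D$ is a
bounded domain of $R_n$.
Let $A$ be the operator $-\Delta$, defined on all twice continuously differentiable,
finite functions in $D$, i.e. let $D(A) = \mathring{C}^{(2)}(D)$. The set $\mathring{C}^{(2)}(D)$ is dense in $H$.
If $\varphi(x) \in D(A)$, then

$$\int_D -\Delta\varphi(x)\,\overline{\varphi(x)}\,dx = \sum_{k=1}^{n} \int_D \left|\frac{\partial\varphi(x)}{\partial x_k}\right|^2 dx, \tag{28}$$

i.e. $A$ is positive, and hence symmetric. In addition, we know from [114] that
for all $\varphi(x) \in D(A)$,

$$\|\varphi\|_{L_2(D)} \le C\,\Bigl(\sum_{k=1}^{n} \Bigl\|\frac{\partial\varphi(x)}{\partial x_k}\Bigr\|_{L_2(D)}^2\Bigr)^{1/2} \tag{29}$$

with the same constant $C$, depending only on the dimensions of $D$. It follows
from (28) and (29) that

$$\|\varphi\|_{L_2(D)} \le C^2\,\|A\varphi\|_{L_2(D)}, \tag{30}$$

i.e. $A$ is positive definite, and the bounded inverse $A^{-1}$ exists on $R(A)$, with
$\|A^{-1}\| \le C^2$ and $R(\bar A) = \overline{R(A)}$.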
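
The passage from (28) and (29) to (30) is short but worth writing out. The sketch below assumes (29) in its squared form and uses Buniakowski's inequality once; all norms are those of $L_2(D)$.

```latex
\[
\|\varphi\|^2
  \;\le\; C^2 \sum_{k=1}^{n} \Bigl\|\frac{\partial\varphi}{\partial x_k}\Bigr\|^2
  \;=\; C^2\,(A\varphi, \varphi)
  \;\le\; C^2\,\|A\varphi\|\,\|\varphi\| ,
\]
```

and (30) follows on dividing by $\|\varphi\|$ (the case $\varphi = 0$ being trivial). The same chain also gives $(A\varphi, \varphi) \ge C^{-2}\|\varphi\|^2$, which is the positive definiteness asserted in the text.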

As will be shown later, such operators $A$ admit of self-conjugate extensions,
where each extension corresponds to some boundary value problem for the
Laplace operator. Here we shall explain the structure of $D(\bar A)$ and $D(A^*)$.
We take $\varphi(x) \in D(A)$. We can easily show by integration by parts that

$$\int_D |\Delta\varphi(x)|^2\,dx = \int_D \sum_{i,k=1}^{n} \left|\frac{\partial^2\varphi(x)}{\partial x_i\,\partial x_k}\right|^2 dx, \tag{31}$$

from which it follows, if (28) and (29) are taken into account, that

$$\|\varphi\|_{W_2^{(2)}(D)} \le C_1\,\|A\varphi\|_{L_2(D)}, \tag{32}$$

where $C_1$ is a constant depending only on the domain. Now let $\varphi_m(x) \in D(A)$,
$\varphi_m(x) \Rightarrow \varphi(x)$ and $A\varphi_m(x) \Rightarrow \bar A\varphi(x)$. It follows from (32) that the $\varphi_m(x)$
now converge to $\varphi(x)$ in the norm of $W_2^{(2)}(D)$, so that $\varphi(x)$ belongs to $W_2^{(2)}(D)$
and $\bar A\varphi = -\Delta\varphi(x)$. We wrote $\mathring{W}_2^{(2)}(D)$ for this completion of $D(A)$ in [113],
so that $D(\bar A) = \mathring{W}_2^{(2)}(D)$.
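Now let $\varphi(x) \in D(A^*)$. This implies that a function $\psi(x) \in H$ exists such that

$$\int_D -\Delta\omega(x)\,\overline{\varphi(x)}\,dx = \int_D \omega(x)\,\overline{\psi(x)}\,dx \tag{33}$$

for all $\omega(x) \in D(A)$. We form the Newtonian potential

$$u(x) = \frac{1}{(n - 2)\,\kappa_n} \int_D \frac{\psi(y)}{|x - y|^{n-2}}\,dy, \tag{34}$$

where $\kappa_n$ is the area of the unit sphere in $R_n$.
We know [II; 201] that, if $\psi(y)$ is continuously differentiable, $u(x)$ is twice
continuously differentiable in $D$ and

$$-\Delta u(x) = \psi(x). \tag{35}$$

We show next that, if $\psi(y) \in L_2(D)$, then $u(x) \in W_2^{(2)}(D)$. To do this, we
extend $\psi(y)$ by zero outside $D$, form the average $\psi_\rho(y)$ and consider the
functions

$$u_\rho(x) = \frac{1}{(n - 2)\,\kappa_n} \int \frac{\psi_\rho(y)}{|x - y|^{n-2}}\,dy.$$

They are twice continuously differentiable and satisfy

$$-\Delta u_\rho(x) = \psi_\rho(x). \tag{36}$$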

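As $\rho \to 0$, $u_\rho(x)$ and $\partial u_\rho(x)/\partial x_k$ are convergent to $u(x)$ and $\partial u(x)/\partial x_k$ in the
norm of $L_2(D_1)$ [115], where $D_1$ is any bounded domain. We shall assume that
$D$ lies strictly inside $D_1$. By (36), we can say that, for $v(x) = u_\rho(x) - u_{\rho'}(x)$,

$$\int_{R_n} |\Delta v(x)|^2\,\zeta(x)\,dx = \int_{R_n} |\psi_\rho(x) - \psi_{\rho'}(x)|^2\,\zeta(x)\,dx, \tag{37}$$

where $\zeta(x)$ is a fixed non-negative, twice continuously differentiable function
(such functions can be shown to exist), equal to unity in $D$, to zero outside $D_1$,
and satisfying everywhere the condition:

$$\left|\frac{\partial\zeta(x)}{\partial x_k}\right|^2 \le C_2\,\zeta(x). \tag{38}$$

By using the formula for integration by parts, we can transform (37) to

$$\int_{D_1} |\psi_\rho(x) - \psi_{\rho'}(x)|^2\,\zeta(x)\,dx = \int_{D_1} \sum_{i,k=1}^{n} \left|\frac{\partial^2 v(x)}{\partial x_i\,\partial x_k}\right|^2 \zeta(x)\,dx + \int_{D_1} \sum_{i,k=1}^{n} \left[\frac{\partial v(x)}{\partial x_i}\,\frac{\partial^2 \overline{v(x)}}{\partial x_i\,\partial x_k}\,\frac{\partial\zeta(x)}{\partial x_k} - \frac{\partial v(x)}{\partial x_i}\,\frac{\partial^2 \overline{v(x)}}{\partial x_k^2}\,\frac{\partial\zeta(x)}{\partial x_i}\right] dx.$$

Hence, on using inequality (38) and the inequality $|2ab| \le \varepsilon|a|^2 + |b|^2/\varepsilon$,
valid for any $\varepsilon > 0$, we find that

$$\int_{D_1} \sum_{i,k=1}^{n} \left|\frac{\partial^2 v(x)}{\partial x_i\,\partial x_k}\right|^2 \zeta(x)\,dx \le \int_{D_1} |\psi_\rho(x) - \psi_{\rho'}(x)|^2\,\zeta(x)\,dx + C_3 \int_{D_1} \left[\varepsilon \sum_{i,k=1}^{n} \left|\frac{\partial^2 v(x)}{\partial x_i\,\partial x_k}\right|^2 \zeta(x) + \frac{1}{\varepsilon} \sum_{k=1}^{n} \left|\frac{\partial v(x)}{\partial x_k}\right|^2\right] dx.$$

We take $\varepsilon = 1/2C_3$. It now follows from the last inequality that

$$\frac12 \int_{D_1} \sum_{i,k=1}^{n} \left|\frac{\partial^2 v(x)}{\partial x_i\,\partial x_k}\right|^2 \zeta(x)\,dx \le \int_{D_1} |\psi_\rho(x) - \psi_{\rho'}(x)|^2\,\zeta(x)\,dx + 2C_3^2 \int_{D_1} \sum_{k=1}^{n} \left|\frac{\partial v(x)}{\partial x_k}\right|^2 dx,$$

whence we can conclude that, as $\rho$ and $\rho' \to 0$, the function $v(x) = u_\rho(x) -
u_{\rho'}(x)$ tends to zero along with its first and second derivatives in the norm
of $L_2(D)$.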

Consequently the limit function $u(x)$ of $u_\rho(x)$, defined by integral (34),
belongs to $W_2^{(2)}(D)$ and satisfies equation (35). We therefore have for it:

$$\int_D -\Delta\omega(x)\,\overline{u(x)}\,dx = \int_D \omega(x)\,\overline{\psi(x)}\,dx$$

for any $\omega(x) \in D(A)$. On subtracting it from (33), we get the identity for $\varphi(x)$:

$$\int_D -\Delta\omega(x)\,\overline{[\varphi(x) - u(x)]}\,dx = 0.$$

We showed in [119] that this identity implies that the function $\varphi(x) - u(x)$
is harmonic, there being a corresponding identity for any function harmonic
in $D$.

We have thus obtained the following form for $\varphi(x) \in D(A^*)$:

$$\varphi(x) = \frac{1}{(n - 2)\,\kappa_n} \int_D \frac{\psi(y)}{|x - y|^{n-2}}\,dy + v(x),$$

where $\psi(x) \in L_2(D)$, and $v(x)$ is a function harmonic in $D$, and since $\varphi(x)$ and
the integral belong to $L_2(D)$, $v(x) \in L_2(D)$ also. It is easily shown that any
function $\varphi(x)$ of this form belongs to $D(A^*)$ and that

$$A^*\varphi = -\Delta\varphi(x) = \psi(x).$$


7. The operator $-\Delta = -\sum_{k=1}^{n} \partial^2/\partial x_k^2$ in space $H = L_2(R_n)$. We shall start
by considering the operator $-\Delta + \lambda E$, where $\lambda$ is any given real number,
instead of $-\Delta$ itself. Let us take a positive $\lambda$, say equal to unity. As will be
seen below, this guarantees the existence of a bounded inverse of $-\Delta + E$,
and thereby simplifies the problem of the self-conjugate extensions of the
operator.

Thus, let $A$ be the operator $-\Delta + E$, defined on twice continuously
differentiable finite functions $\varphi(x)$. We know that $D(A)$ is dense in $H$, and it is easily
seen that $-\Delta + E$ is symmetric. It follows from the discussion of the previous
example that the elements of $D(\bar A)$ and $D(A^*)$ have generalized derivatives up
to the second order, square summable over any finite domain $D$ of $R_n$. The
operator $A^*$ is evaluated on $D(A^*)$ as the differential operator
$-\sum_{k=1}^{n} \partial^2\varphi(x)/\partial x_k^2 + \varphi(x)$.

k=1 

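Let us show that $D(\bar A) = W_2^{(2)}(R_n)$. Firstly, $D(\bar A) \subset W_2^{(2)}(R_n)$, since, if
$\varphi_m(x) \in D(A)$ and $\varphi_m(x) \Rightarrow \varphi(x)$, $A\varphi_m(x) \Rightarrow \bar A\varphi(x)$, it follows from (28) and
(31) that

$$\int_{R_n} \left[\sum_{k=1}^{n} \left|\frac{\partial(\varphi_l(x) - \varphi_m(x))}{\partial x_k}\right|^2 + \sum_{i,k=1}^{n} \left|\frac{\partial^2(\varphi_l(x) - \varphi_m(x))}{\partial x_i\,\partial x_k}\right|^2\right] dx =$$
$$= \int_{R_n} \left[-\Delta(\varphi_l(x) - \varphi_m(x))\,\overline{(\varphi_l(x) - \varphi_m(x))} + |\Delta(\varphi_l(x) - \varphi_m(x))|^2\right] dx \le$$
$$\le \|\Delta(\varphi_l - \varphi_m)\|_{L_2(R_n)} \cdot \|\varphi_l - \varphi_m\|_{L_2(R_n)} + \|\Delta(\varphi_l - \varphi_m)\|_{L_2(R_n)}^2 \to 0$$

as $l$ and $m \to \infty$.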


Let us prove the reverse inclusion, i.e. that $W_2^{(2)}(R_n) \subseteq D(\bar A)$. Let
$\varphi(x) \in W_2^{(2)}(R_n)$. We form a sequence of functions $\psi_m(x) = \varphi_{1/m}(x)\,\zeta_m(|x|)$, where
$\varphi_{1/m}(x)$ is an averaging of $\varphi(x)$ with radius $1/m$, and $\zeta_m(r)$ is a twice continuously
differentiable finite function defined for $r \ge 0$ as follows:

$$\zeta_m(r) = \begin{cases} 1 & \text{for } 0 \le r \le m, \\ 0 & \text{for } r \ge m + 1, \\ \xi(r - m) & \text{for } m \le r \le m + 1, \end{cases}$$

and $\xi(r) \le 1$ is a smooth non-negative function, equal to zero for $r \ge 1$.
Each $\psi_m(x) \in D(A)$. Let us show that $\psi_m(x) \to \varphi(x)$ as $m \to \infty$ in the norm of
$W_2^{(2)}(R_n)$. On splitting into two parts the integration over $R_n$ in the expression for
the norm of $W_2^{(2)}(R_n)$: the part where $|x| \le l$, and the part where $|x| > l$,
we obtain

$$\|\varphi - \psi_m\|^2_{W_2^{(2)}(R_n)} = \|\varphi - \psi_m\|^2_{W_2^{(2)}(|x| \le l)} + \|\varphi - \psi_m\|^2_{W_2^{(2)}(|x| > l)}.$$

Given $\varepsilon > 0$, the second term can be made $< \varepsilon/2$ for all $m$, if we take $l$
sufficiently large. This follows from $\|\varphi\|_{W_2^{(2)}(R_n)} < +\infty$, from the formation
of the $\psi_m(x)$ and the properties of the average of $\varphi(x)$ and its generalized
derivatives. We fix $l$ in the manner indicated. When $m > l$, we have $\psi_m(x) =
\varphi_{1/m}(x)$ for $|x| \le l$ and $\varphi_{1/m}(x) \to \varphi(x)$ in the norm of $W_2^{(2)}(|x| \le l)$.

It follows from this that, for all sufficiently large $m$,

$$\|\varphi - \psi_m\|^2_{W_2^{(2)}(|x| \le l)} < \frac{\varepsilon}{2} \quad\text{and}\quad \|\varphi - \psi_m\|^2_{W_2^{(2)}(R_n)} < \varepsilon.$$


The inclusion $W_2^{(2)}(R_n) \subseteq D(\bar A)$ follows from what has been said, and so we
can take it as proved that $D(\bar A) = W_2^{(2)}(R_n)$. Let us show that $\bar A = A^*$, i.e.
that $\bar A$ is a self-conjugate operator. It is sufficient to show that the domain of
values of $\bar A$ is the whole of $L_2(R_n)$. We shall take $n = 3$ for simplicity. We take
any finite continuously differentiable function $\varphi(x) \in L_2(R_n)$. The function

$$u(x) = \frac{1}{4\pi} \int_{R_n} \frac{e^{-|x - y|}\,\varphi(y)}{|x - y|}\,dy$$

is twice continuously differentiable, satisfies $-\Delta u + u = \varphi$ and decreases
exponentially as $|x| \to \infty$ together with its derivatives [IV; 231], so that
certainly $u(x) \in W_2^{(2)}(R_n)$. Consequently, $u(x) \in D(\bar A)$, $\bar Au = \varphi$ and the domain
of values $R(\bar A)$ of $\bar A$ is dense in $H$.

Let us show that a bounded inverse $\bar A^{-1}$ exists on $R(\bar A)$. It will follow from
this that $R(\bar A)$ is a subspace, i.e. $R(\bar A) = H$.

Thus, let $v(x) \in D(\bar A)$ and $-\Delta v(x) + v(x) = \psi(x)$. We multiply this equation
by $\overline{v(x)}$ and integrate over $R_n$, after which we use integration by parts:

$$\int_{R_n} \psi(x)\,\overline{v(x)}\,dx = \int_{R_n} \left[\sum_{k=1}^{n} \left|\frac{\partial v(x)}{\partial x_k}\right|^2 + |v(x)|^2\right] dx.$$

This gives us, in view of Cauchy's inequality,

$$\|v\|_{L_2(R_n)} \le \|\psi\|_{L_2(R_n)} = \|\bar Av\|_{L_2(R_n)},$$

i.e. $\bar A^{-1}$ in fact exists on $R(\bar A)$ and $\|\bar A^{-1}\| \le 1$.

This completes our proof that $\bar A = A^*$, or, what amounts to the same thing,
that the closure of $A$ amounts to its (unique) self-conjugate extension, where
$D(\bar A) = W_2^{(2)}(R_n)$. This also holds for $-\Delta = \bar A - E$, i.e. the operator $-\Delta$ is
self-conjugate on $W_2^{(2)}(R_n)$. However, as distinct from $-\Delta + E$, $-\Delta$ has no
bounded inverse (it is easy to show that the inverse of $-\Delta$ exists, but is not
bounded).
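
One standard way to see the last remark is through the Fourier transform; this is offered as a sketch under that assumption and is not necessarily the argument the author has in mind. Here $\hat\varphi(\xi)$ denotes the Fourier transform of $\varphi$, normalized so that Parseval's equation holds without a constant factor.

```latex
\begin{align*}
\|{-\Delta\varphi}\|^2 &= \int_{R_n} |\xi|^4\,|\hat\varphi(\xi)|^2\,d\xi , \qquad
\|\varphi\|^2 = \int_{R_n} |\hat\varphi(\xi)|^2\,d\xi , \\
-\Delta\varphi = 0 &\;\Longrightarrow\; |\xi|^2\,\hat\varphi(\xi) = 0 \text{ almost everywhere}
  \;\Longrightarrow\; \varphi = 0 , \\
\hat\varphi_\varepsilon(\xi) = 0 \text{ for } |\xi| > \varepsilon
  &\;\Longrightarrow\; \|{-\Delta\varphi_\varepsilon}\| \le \varepsilon^2\,\|\varphi_\varepsilon\| .
\end{align*}
```

The second line shows that the inverse exists, while the third shows that no inequality of the form $\|{-\Delta\varphi}\| \ge p\,\|\varphi\|$ with $p > 0$ can hold on the whole domain, so the inverse is unbounded.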


189. The spectrum of a self-conjugate operator. It may be shown 
as in [128] that the eigenvalues of a self-conjugate operator are real, 
whilst the eigenelements corresponding to different eigenvalues are 
orthogonal, and a self-conjugate operator generates an orthonormal 
system of eigenelements. Notice that, since a self-conjugate operator 
is closed, the eigenelements corresponding to a fixed eigenvalue 
(including the zero element) form a subspace. 

The definitions of a regular point of an operator and of a point of 
the spectrum are the same as in [129]. We shall now prove, for a self- 
conjugate operator, analogues of the theorems of [129]. 

THEOREM 1. If $\lambda$ is not an eigenvalue of a self-conjugate operator A,
the lineal $R(A - \lambda E)$ is dense in H.

If $\lambda$ is real, $A - \lambda E$ is a self-conjugate operator, and the theorem is
a consequence of Theorem 1 of [187]. Let $\lambda$ not be real. If we were to
have $\overline{R(A - \lambda E)} \ne H$, there would exist a non-zero element z,
orthogonal to $R(A - \lambda E)$, i.e., for every $x \in D(A)$,

$$((A - \lambda E)x, z) = 0 \quad \text{or} \quad ((A - \lambda E)x, z) = (x, 0).$$

It follows from this that $z \in D(A)$ and $(A - \lambda E)^* z = (A - \bar\lambda E)z = 0$,
i.e. $Az = \bar\lambda z$, which is absurd, since A can only have real eigenvalues.

THEOREM 2. The necessary and sufficient condition for $\lambda$ to be a regular
point of a self-conjugate operator A is that there exist a
positive number p such that

$$\|(A - \lambda E)x\| \ge p\,\|x\| \quad (x \in D(A)). \qquad (39)$$

Non-real values of $\lambda$ are regular points of a self-conjugate operator.

The proof is the same as in [129], except that we have to use the
fact that A is closed, instead of its continuity.

The following corollary may be obtained, as in [129]: points of the
spectrum form a closed set.

It will be seen later that every self-conjugate operator has at least one
point in its spectrum.

Non-real $\lambda$ are regular points of the self-conjugate operator A, so that
in future we shall only speak about real $\lambda$. In this case the operator
$(A - \lambda E)$ is self-conjugate.

Suppose that $\lambda$ is not an eigenvalue. If $R(A - \lambda E) = H$, the closed
operator $(A - \lambda E)^{-1}$, defined on the whole of H, is bounded, i.e. $\lambda$ is
a regular point.

Conversely, if $R(A - \lambda E)$ is not H, it follows from Theorem 3 of
[187] that $(A - \lambda E)^{-1}$ is unbounded on $R(A - \lambda E)$. We arrive at the
following theorem:

THEOREM 3. Let the real $\lambda$ not be an eigenvalue. If $R(A - \lambda E) = H$,
then $\lambda$ is a regular point, and if the lineal $R(A - \lambda E)$ is not the whole
of H (this lineal is dense in H), $\lambda$ is a point of the spectrum.

We now take the case when $\lambda$ is an eigenvalue, and we write $P_\lambda$ for
the subspace of corresponding eigenelements (including the zero
element). A unique operator exists, for which $P_\lambda = H$; this is the
operator of multiplication by the number $\lambda$: $Ax = \lambda x$ for all $x \in H$.
In the remaining cases the subspace $P_\lambda$ is a proper part of H.

Let us now show that $P_\lambda$ is the set of elements x orthogonal to the
lineal $(A - \lambda E)z$ $(z \in D(A))$.

For, since $(A - \lambda E)$ is self-conjugate, it follows from
$((A - \lambda E)z, x) = 0$ that $x \in D(A)$ and $(A - \lambda E)x = 0$, and conversely,
if $x \in D(A)$ and $(A - \lambda E)x = 0$, then $((A - \lambda E)z, x) = 0$. It follows from
this that the subspace $Q_\lambda$, complementary to $P_\lambda$,

$$H = P_\lambda \oplus Q_\lambda,$$

is the closure of the lineal of elements y defined by

$$y = (A - \lambda E)z \quad (z \in D(A)),$$

i.e. $Q_\lambda = \overline{R(A - \lambda E)}$. We write the element z of D(A) as $z = z_1 + z_2$,
where $z_1 \in P_\lambda$ and $z_2 \in Q_\lambda$. It follows from the definition of $P_\lambda$ that
$z_1 \in D(A)$ and $Az_1 = \lambda z_1$, so that $z_2 = z - z_1 \in D(A)$ also, i.e. the
projection of the lineal D(A) on $Q_\lambda$ is a lineal contained in D(A). Let us
denote it by $D_\lambda(A)$. This is obviously the lineal of the elements of $Q_\lambda$ that
belong to D(A). The operator A is defined on this lineal, and it may
easily be seen that, if $y \in D_\lambda(A)$, then $Ay \in Q_\lambda$. For, (y, x) = 0 by
hypothesis, if $x \in D(A)$ and satisfies the equation $Ax = \lambda x$, and
hence $(Ay, x) = (y, Ax) = (y, \lambda x) = 0$. By what has been said, we can
regard A as an operator in the subspace $Q_\lambda$, which we can look on
as a new space H. Let $A_\lambda$ denote this operator, so that $A_\lambda$ is defined
on $D_\lambda(A)$ and $A_\lambda y = Ay$ if $y \in D_\lambda(A)$. Following the usual notation,
we can write $D(A_\lambda)$ instead of $D_\lambda(A)$.

It follows from a general theorem which we shall prove in [191] that
$D(A_\lambda)$ is dense in $Q_\lambda$ and that $A_\lambda$ is a self-conjugate operator in $Q_\lambda$.
By virtue of the actual construction of $Q_\lambda$, $\lambda$ cannot be an eigenvalue
of $A_\lambda$, but it may be either a regular point or a point of the spectrum
of this operator. In the former case, $(A_\lambda - \lambda E)z$ $(z \in D(A_\lambda))$ transforms
$D(A_\lambda)$ into the complete space $Q_\lambda$, and in the latter case into a lineal
dense in $Q_\lambda$. We can write $(A - \lambda E)z$ instead of $(A_\lambda - \lambda E)z$. We have
$(A - \lambda E)z = 0$ for all z of $P_\lambda$.

The above discussion leads to the following classification of values
of $\lambda$.

I. Regular values of $\lambda$, which are characterized by the fact that
$R(A - \lambda E) = H$, and the existence of a bounded inverse $(A - \lambda E)^{-1}$.

II. Values of $\lambda$ for which $R(A - \lambda E)$ is a lineal different from H,
and $\overline{R(A - \lambda E)} = H$. The inverse unbounded operator $(A - \lambda E)^{-1}$
exists on $R(A - \lambda E)$. We usually say that such values of $\lambda$ belong to
the continuous spectrum.

III. Values of $\lambda$ which are eigenvalues of A, and for which $A_\lambda$ has $\lambda$
as a regular point. For these values, $R(A - \lambda E)$ is a subspace, not the
same as H. It is generally said of these $\lambda$ that they belong to a point
spectrum only.

IV. Values of $\lambda$ which are eigenvalues of A and for which $A_\lambda$ has $\lambda$
as a point of its spectrum. For these values, $R(A - \lambda E)$ is not a
subspace, but is a lineal, the closure of which $\overline{R(A - \lambda E)}$ is a subspace,
not the same as H. We say of these $\lambda$ that they belong simultaneously
to a point and continuous spectrum.

Certain types of values of $\lambda$ may be absent in the spectrum of a
self-conjugate operator A. But we have shown that the spectrum of
a bounded self-conjugate operator contains at least one point. It is
easily seen that the same is true for an unbounded self-conjugate
operator A. For, suppose that $(A - \lambda E)$ has a bounded inverse for
any real $\lambda$. We have the obvious equation

$$\frac{1}{\lambda}\,(A - \lambda E)\,A^{-1} = \frac{1}{\lambda}E - A^{-1} \quad (\lambda \ne 0),$$

where both sides represent a self-conjugate operator. It follows from
this equation and the supposition in question that $R(\mu E - A^{-1})$ is the
whole of H for any real $\mu$, whilst this contradicts the fact that the
bounded operator $A^{-1}$ has spectral points.

It will be shown below that an unbounded self-conjugate operator
has an infinite set of spectral points, distributed outside any fixed
interval of the $\lambda$ axis.


If $\lambda$ is not an eigenvalue of A, the operator

$$R_\lambda = (A - \lambda E)^{-1} \qquad (40)$$

is called the resolvent of A, as we know. It is defined on $R(A - \lambda E)$
and transforms this lineal one-to-one into D(A). It follows from the
definition of inverse operator that, if $x \in R(A - \lambda E)$ and $R_\lambda x = 0$, then
x = 0 (cf. [144]). As in the case of bounded operators, we have

$$R_\lambda^* = R_{\bar\lambda}. \qquad (41)$$

If $\lambda$ is a real number, this follows from the fact that $R_\lambda$ is
self-conjugate. When $\lambda$ is complex, it follows from the equations
$(A - \lambda E)^* = A - \bar\lambda E$ and $[(A - \lambda E)^{-1}]^* = [(A - \lambda E)^*]^{-1}$. If $\lambda$ and $\mu$ are
regular values, it may be shown, precisely as for bounded operators,
that [144]:

$$R_\mu - R_\lambda = (\mu - \lambda)\, R_\mu R_\lambda. \qquad (42)$$
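The two resolvent identities, $R_\lambda^* = R_{\bar\lambda}$ and $R_\mu - R_\lambda = (\mu - \lambda) R_\mu R_\lambda$, can be checked numerically in the simplest setting. The sketch below is only an illustration under an assumption that is not in the text: A acts diagonally with a handful of made-up real eigenvalues, so that each resolvent is diagonal too and every identity reduces to a statement about numbers. The names `eigenvalues`, `resolvent` and `check_identities` are hypothetical.

```python
# Illustrative sketch (not from the text): A is diagonal with real
# eigenvalues, so R_lam = (A - lam E)^{-1} is diagonal as well.
eigenvalues = [-2.0, 0.5, 1.0, 3.0]          # assumed sample spectrum


def resolvent(lam):
    """Diagonal entries of R_lam = (A - lam E)^{-1} for a regular value lam."""
    return [1.0 / (a - lam) for a in eigenvalues]


def check_identities(lam, mu):
    """Return the largest entrywise violation of (41) and of (42)."""
    r_lam, r_mu = resolvent(lam), resolvent(mu)
    r_lam_bar = resolvent(lam.conjugate())
    # (41): the adjoint of a diagonal operator is its entrywise conjugate.
    err41 = max(abs(r.conjugate() - rb) for r, rb in zip(r_lam, r_lam_bar))
    # (42): R_mu - R_lam = (mu - lam) R_mu R_lam.
    err42 = max(abs((rm - rl) - (mu - lam) * rm * rl)
                for rm, rl in zip(r_mu, r_lam))
    return err41, err42


err41, err42 = check_identities(1.0 + 2.0j, -0.5 - 1.0j)
print(err41, err42)
```

Both errors are at the level of rounding, which is all the diagonal case can show; it says nothing about domain questions, which are the real content for unbounded A.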


190. The case of a point spectrum. A self-conjugate operator A is
said to have a point spectrum if the orthonormal system of its
eigenelements is complete in H. Let $x_k$ (k = 1, 2, ...) be this system,
enumerated in some manner, and let $\lambda_k$ be the corresponding eigenvalues:
$Ax_k = \lambda_k x_k$. By hypothesis, any element $x \in H$ is expressible by its
Fourier series:

$$x = \sum_{k=1}^{\infty} a_k x_k. \qquad (43)$$

THEOREM 1. The necessary and sufficient condition for x to belong to
D(A) is that the series

$$\sum_{k=1}^{\infty} \lambda_k^2\, |a_k|^2 \qquad (44)$$

be convergent, and if this condition is fulfilled,

$$Ax = \sum_{k=1}^{\infty} \lambda_k a_k x_k. \qquad (45)$$

If $x \in D(A)$, a Fourier coefficient of the element Ax is
$(Ax, x_k) = (x, Ax_k) = (x, \lambda_k x_k) = \lambda_k a_k$, whence follows the convergence of
series (44). Suppose conversely that (44) is convergent, and let us
show that $x \in D(A)$. Since (44) is convergent, we can form the element

$$x' = \sum_{k=1}^{\infty} \lambda_k a_k x_k \qquad (46)$$

and, on writing $y_n$ for a segment of series (43), we have $y_n \in D(A)$,
$y_n \Rightarrow x$ and $Ay_n \Rightarrow x'$, whence it follows, since A is closed, that
$x \in D(A)$ and $Ax = x'$. The theorem is proved.
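The domain criterion of Theorem 1 can be made concrete with a small numerical sketch. The choices below are assumptions made purely for illustration and do not come from the text: the eigenvalues are taken as $\lambda_k = k$, and two coefficient sequences are compared. With $a_k = 1/k$ the element x lies in H, yet the partial sums of series (44) grow without bound, so x is not in D(A); with $a_k = 1/k^2$ the partial sums stay bounded.

```python
# Hypothetical example: eigenvalues lam_k = k, so A is unbounded.
def partial_sum_44(coeff, n_terms):
    """Partial sum of series (44): sum of lam_k^2 * |a_k|^2 with lam_k = k."""
    return sum((k * abs(coeff(k))) ** 2 for k in range(1, n_terms + 1))


def a_slow(k):
    """a_k = 1/k: x lies in H (sum of 1/k^2 converges), but (44) diverges."""
    return 1.0 / k


def a_fast(k):
    """a_k = 1/k^2: series (44) becomes the sum of 1/k^2 and converges."""
    return 1.0 / k ** 2


for n in (10, 100, 1000):
    print(n, partial_sum_44(a_slow, n), partial_sum_44(a_fast, n))
```

The first column of sums grows like n, the second stays below $\pi^2/6$, which matches the dichotomy in the theorem.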

We have already seen that a completely continuous self-conjugate
operator has a purely point spectrum, where $\lambda_k \to 0$ as $k \to \infty$ for any
enumeration of the eigenvalues. We must mention a further important
case of a self-conjugate operator that has a point spectrum. Let A be
a self-conjugate operator with a completely continuous inverse $A^{-1}$.
By definition of the inverse, the equation $A^{-1}x = 0$ has no non-trivial
solutions. The completely continuous self-conjugate operator $A^{-1}$ has
a point spectrum, and its eigenvalues $\mu_k$ (k = 1, 2, ...) can be
enumerated in order of non-increasing absolute value:
$|\mu_1| \ge |\mu_2| \ge \dots$, where $\mu_k \ne 0$ for all k, by what has been said above. On
writing $x_k$ for the corresponding eigenelements that form an
orthonormal system (complete in H), we can write $A^{-1}x_k = \mu_k x_k$, whence
it follows immediately that $Ax_k = \lambda_k x_k$, where $\lambda_k = 1/\mu_k$. On recalling
what was said in [136], we arrive at the following theorem:

THEOREM 2. If a self-conjugate operator A has a completely continuous
inverse $A^{-1}$, A has a point spectrum, all its eigenvalues are of finite
rank and any finite interval contains only a finite number of eigenvalues
of A.

It follows from what has been said that the eigenvalues $\lambda_k$ of such
an operator A can be enumerated in order of non-decreasing absolute
value: $|\lambda_1| \le |\lambda_2| \le \dots$, where $|\lambda_k| \to +\infty$ as $k \to \infty$.

Values other than eigenvalues may belong to the spectrum of an
operator with a point spectrum. For instance, if $\lambda$ is a point of
condensation of the $\lambda_k$, it must belong to the spectrum, since the regular
points form an open set. Let us show that no other $\lambda$ can belong to the
spectrum.

THEOREM 3. If a self-conjugate operator A has a point spectrum, every
real $\lambda$, different from an eigenvalue and not a point of condensation of
these values, is a regular point of A.

There exists by hypothesis a positive number m such that
$|\lambda_k - \lambda| \ge m$ for all k. Let $x \in D(A)$. It follows from (43) and (45) that

$$\|(A - \lambda E)x\|^2 = \sum_{k=1}^{\infty} |\lambda_k - \lambda|^2\, |a_k|^2 \ge m^2 \sum_{k=1}^{\infty} |a_k|^2 = m^2 \|x\|^2,$$

whence, by Theorem 2 of [189], the point $\lambda$ must be regular. In other
words, A has a purely point spectrum in the present case.
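The estimate used in the proof of Theorem 3 can be tried out on a truncated model. Everything specific in the sketch below is an assumption chosen for illustration: the eigenvalues are $\lambda_k = k$ cut off at k = 200, the point is $\lambda = 2.5$, and x has Fourier coefficients $1/k^2$. The quantity m is the distance from $\lambda$ to the nearest eigenvalue, and the printed ratio $\|(A - \lambda E)x\| / \|x\|$ should not fall below it.

```python
import math

eigenvalues = list(range(1, 201))      # assumed: lam_k = k, truncated at 200
lam = 2.5                              # neither an eigenvalue nor a limit point
m = min(abs(l - lam) for l in eigenvalues)   # distance from lam to the spectrum


def norm(v):
    """Norm of an element given by its Fourier coefficients."""
    return math.sqrt(sum(abs(c) ** 2 for c in v))


def apply_shifted(coeffs):
    """Fourier coefficients of (A - lam E)x, given those of x."""
    return [(l - lam) * a for l, a in zip(eigenvalues, coeffs)]


x = [1.0 / k ** 2 for k in eigenvalues]      # sample element of D(A)
ratio = norm(apply_shifted(x)) / norm(x)
print(m, ratio)
```

The bound is attained on the eigenelements nearest to $\lambda$ (here k = 2 and k = 3), so m cannot be improved in general.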

If a self-conjugate operator A has no eigenvalues, we say that it 
has a purely continuous spectrum. In this case, instead of any element 
x of H being expressible as a Fourier series (43), it is expressible as a 
sum of integrals [cf. 149]. We shall show below that a point spectrum 
can be extracted from the purely continuous spectrum of a self-con- 
jugate operator, just as, in [189], we extracted one eigenvalue from 
the remaining part of the spectrum. Certain new concepts must be 
introduced in this connection. They are of interest in themselves for 
the theory of operators. 


191. Invariant subspaces and the reducibility of an operator. 
Before introducing the concept of invariant subspace, we must discuss 
the commutation of an operator B defined on the whole of H with an 
operator A defined on only part of H. 

DEFINITION. A bounded operator B, defined everywhere, is said to
commute with an operator A when the following conditions are fulfilled:
(1) if $x \in D(A)$, then $Bx \in D(A)$; (2) if $x \in D(A)$, then BAx = ABx.

If A is bounded and specified everywhere, the first condition falls 
out, and we have the earlier definition of commuting operators. 

THEOREM 1. A necessary condition for B to commute with a
self-conjugate operator A is that it commute with the resolvent $R_\lambda$ for any
regular value of $\lambda$. A sufficient condition is that it commute with $R_\lambda$
for at least one regular $\lambda$.

The commutation of B and $R_\lambda$ (with regular $\lambda$) is an ordinary
commutation, such as was defined earlier ($BR_\lambda x = R_\lambda Bx$ for $x \in H$).
Let B commute with A, $\lambda$ be any regular value and y any element of H.
Now, $R_\lambda y$ and $BR_\lambda y \in D(A)$, and we have in addition:

$$ABR_\lambda y = BAR_\lambda y. \qquad (47)$$

But $(A - \lambda E)R_\lambda y = y$, whence $AR_\lambda y = (\lambda R_\lambda + E)y$, and (47)
can be rewritten as

$$ABR_\lambda y = \lambda BR_\lambda y + By; \quad \text{i.e.} \quad (A - \lambda E)BR_\lambda y = By. \qquad (48)$$

On applying $R_\lambda$ to both sides of the last equation, we get
$BR_\lambda y = R_\lambda By$, and the necessity is proved.

Now let a regular $\lambda$ exist such that $BR_\lambda y = R_\lambda By$. The form of
the right-hand side implies that both sides belong to D(A) for any $y \in H$.
If y runs over the whole of H, $x = R_\lambda y$ runs over the whole of D(A),
and the equation shows that, if $x \in D(A)$, then $Bx \in D(A)$ also. On
applying the operator $(A - \lambda E)$ to both sides, we get (48) and (47),
whilst (47) can be rewritten as ABx = BAx for $x \in D(A)$, and the
sufficiency is proved.

COROLLARY. If B commutes with $R_\lambda$ for any one regular $\lambda$, it commutes
with $R_\lambda$ for all regular $\lambda$.

We now turn to the definition of invariant subspace. 

DEFINITION. A subspace L is said to be invariant under the operator A
if the following condition is satisfied: if $x \in D(A)$, and $x \in L$, then
$Ax \in L$ also.

If L is a subspace invariant under A and $D_L(A)$ is a lineal of elements x
belonging simultaneously to D(A) and L, A induces into L an operator
$A_L$ which is defined on $D_L(A)$ and is equal to A. The subspace L can
be regarded as a Hilbert space (it may be finite-dimensional). As we
shall see below, it is essential that not only L, but the complementary
subspace $H \ominus L$ also be invariant under A, and that the projection of
any element $x \in D(A)$ on L also belong to D(A). This leads us to the
definition:

DEFINITION. A subspace L is said to reduce an operator A when the
following conditions are fulfilled: (1) L and $M = H \ominus L$ are subspaces
invariant under A; (2) if $x \in D(A)$, the projection of x into L belongs to D(A).


We shall in future write $P_K$ for the projector onto the subspace K.
We have

$$x = P_L x + P_M x. \qquad (49)$$

If $x \in D(A)$, it follows from the definition that $P_L x \in D(A)$, and
hence $P_M x \in D(A)$, i.e. if L reduces A, M also reduces A. Let $A_L$ and
$A_M$ be the operators which A induces into L and M. We now have, for
any $x \in D(A)$,

$$Ax = A_L(P_L x) + A_M(P_M x), \qquad (50)$$

and we have thus split A into operators $A_L$ and $A_M$ acting in L and M.

THEOREM 2. The necessary and sufficient condition for the subspace
L to reduce the operator A is that $P_L$ and A commute.

Let us first prove the necessity. Let L reduce A. By the definition
of reducibility, if $x \in D(A)$, then $P_L x \in D(A)$. It remains to show
that $P_L Ax = AP_L x$ for $x \in D(A)$. We have (50), where $A_L(P_L x) \in L$
and $A_M(P_M x) \in M$. On applying the operator $P_L$ to both sides of (50),
we obtain

$$P_L Ax = A_L(P_L x) = AP_L x,$$

which is what we set out to prove.

Sufficiency. Since $P_L$ and A commute, $P_L x \in D(A)$ if $x \in D(A)$.
It remains to show that, if $x \in D(A)$ and $x \in L$, then $Ax \in L$ and
similarly for M. The first follows immediately from $P_L Ax = AP_L x$,
the left-hand side of which obviously belongs to L, whilst the
right-hand side can be written as $AP_L x = Ax$, since $x \in L$. The same is
true for M, since A commutes with $P_M$ if A commutes with $P_L$.
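In a finite-dimensional space every operator is defined everywhere, so the domain condition in the definition of reducibility is automatic and Theorem 2 becomes a statement about matrices. The sketch below is a toy illustration under that assumption; the matrices `A_reduced` and `A_not_reduced` and the helper names are invented for the example. L is spanned by the first two coordinate vectors and M by the third.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    n, m, p = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]


# P projects onto L = span(e1, e2) in a 3-dimensional space.
P = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]

# Block-diagonal A: L and its complement M = span(e3) are both invariant.
A_reduced = [[2, 1, 0], [1, 2, 0], [0, 0, 5]]

# Here L is still invariant (the first two columns stay in L), but M is
# not: the image of e3 has a component in L.
A_not_reduced = [[2, 1, 7], [1, 2, 0], [0, 0, 5]]


def commutes(a, b):
    """True when ab = ba (exact, since all entries are integers)."""
    return matmul(a, b) == matmul(b, a)


print(commutes(P, A_reduced), commutes(P, A_not_reduced))
```

The second matrix is deliberately non-symmetric: for a self-conjugate operator, invariance of L together with the domain condition already forces invariance of M, so such a counterexample cannot be symmetric.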

We now turn to the case when A is self-conjugate. 

THEOREM 3. A sufficient condition for a subspace L, invariant under a
self-conjugate operator A, to reduce this operator is that $x \in D(A)$
implies $P_L x \in D(A)$.

We have to show that the conditions of the theorem imply that, if
$y \in D(A)$ and $y \in M$, then $Ay \in M$ also.

We have for such an element y, and any $x \in D(A)$:
$(P_L Ay, x) = (y, AP_L x)$. But $AP_L x \in L$ by hypothesis, whence $(y, AP_L x) = 0$
and hence $(P_L Ay, x) = 0$ for any $x \in D(A)$. But the lineal D(A) is
dense in H, whence $P_L Ay = 0$, i.e. $Ay \in M$; this is what we wanted
to prove.

Let $D_L(A)$ denote as above the projection of D(A) into L, i.e. the
lineal of the elements of L on which the operator A is defined, and $A_L$
denote the operator which is induced by A into L.


THEOREM 4. If a subspace L reduces a self-conjugate operator A, $D_L(A)$
is dense in L and $A_L$ is a self-conjugate operator in L.

Let y be a given element of L and $\varepsilon > 0$ a given number. We have
to show that there exists an element $x \in D_L(A)$ such that
$\|y - x\| < \varepsilon$. The lineal D(A) is dense in H, so that there exists an element
$z \in D(A)$ such that $\|y - z\| < \varepsilon$. All the more, $\|P_L y - P_L z\| < \varepsilon$.
But $P_L y = y$ and $P_L z \in D_L(A)$, and the first statement of the
theorem is proved. It remains to show that, if

$$(A_L x, y) = (x, y^*), \qquad (51)$$

for any $x \in D_L(A)$, where y and $y^* \in L$, then $y \in D_L(A)$ and $y^* = A_L y$.
We can put $x = P_L z$, where z is any element of D(A), and we get
$(AP_L z, y) = (P_L z, y^*)$ or, by Theorem 2, $(P_L Az, y) = (P_L z, y^*)$,
whence $(Az, P_L y) = (z, P_L y^*)$ and $(Az, y) = (z, y^*)$, since y and
$y^* \in L$. Since A is self-conjugate, the last equation shows that $y \in D(A)$,
so that $y \in D_L(A)$, and $y^* = Ay = A_L y$, and the theorem is proved. We made
use of this theorem in [189].

Let $L_k$ (k = 1, 2, ...) be mutually orthogonal subspaces and L
their orthogonal sum [139]:

$$L = L_1 \oplus L_2 \oplus L_3 \oplus \dots$$

THEOREM 5. If the mutually orthogonal subspaces $L_k$ reduce a closed
operator A, their orthogonal sum also reduces A.

We shall prove this for the case of an infinite number of terms. We
have to show that the operators $P_L$ and A commute. Let $Q_n$ be a
projector, equal to the sum of the first n of the $P_{L_k}$. If $x \in D(A)$, since
the $L_k$ reduce A, we have $Q_n x \in D(A)$ and $AQ_n x = Q_n Ax$. But
$Q_n x \Rightarrow P_L x$ and $AQ_n x = Q_n Ax \Rightarrow P_L Ax$, whence, since A is
closed, $P_L x \in D(A)$ and $AP_L x = P_L Ax$, which is what we wanted to
prove.

Let A be a self-conjugate operator, $\lambda_k$ be distinct eigenvalues
of it, and $L_k$ the corresponding subspaces of eigenelements (including
the zero element). The number of these subspaces may be finite. Each
$L_k$ obviously reduces A. We form their orthogonal sum L. If L is the
whole of H, A has a point spectrum. If this is not the case, we have
the orthogonal decomposition of H:

$$H = L \oplus M \quad \text{and} \quad x = P_L x + P_M x \quad (x \in H), \qquad (52)$$

where L and M reduce A, and this operator induces into L and M the
operators $A_L$ and $A_M$ such that $Ax = A_L(P_L x) + A_M(P_M x)$, where
$A_L(P_L x) = A(P_L x)$ and $A_M(P_M x) = A(P_M x)$ for $x \in D(A)$. The
operator $A_L$ has a point spectrum in L, whilst $A_M$ has a purely
continuous spectrum in M.


192. Resolutions of the identity. The Stieltjes integral. We now turn 
to the theory of spectral functions (resolutions of the identity) for 
self-conjugate operators. This is largely analogous to the case of a 
bounded self-conjugate operator. We shall emphasize the details where 
the unboundedness of the operator has to be taken into account. 

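We define a resolution of the identity as a family of projectors
$\mathcal{E}_\lambda$, depending on a real parameter $\lambda$ in the interval $(-\infty, +\infty)$ and
satisfying the following conditions: (1) if $\mu > \lambda$, then $\mathcal{E}_\mu \ge \mathcal{E}_\lambda$; (2)
$\mathcal{E}_\lambda$ tends to the annihilation operator as $\lambda \to -\infty$ and $\mathcal{E}_\lambda \Rightarrow E$ as
$\lambda \to +\infty$; (3) $\mathcal{E}_\lambda$ is continuous from the right, i.e. $\mathcal{E}_\lambda \Rightarrow \mathcal{E}_{\lambda'}$ as
$\lambda \to \lambda' + 0$. Here, $\mathcal{E}_\lambda \mathcal{E}_\mu = \mathcal{E}_\mu \mathcal{E}_\lambda = \mathcal{E}_\lambda$ for $\lambda < \mu$, and if $\Delta$ is some interval
$[\alpha, \beta]$, we have as before, on writing $\Delta\mathcal{E}_\lambda = \mathcal{E}_\beta - \mathcal{E}_\alpha$:

$$\Delta'\mathcal{E}_\lambda \perp \Delta''\mathcal{E}_\lambda \qquad (53)$$

($\Delta'$ and $\Delta''$ have no common interior points),

$$\Delta'\mathcal{E}_\lambda \cdot \Delta''\mathcal{E}_\lambda = \Delta_0\mathcal{E}_\lambda \qquad (54)$$

($\Delta_0$ is the common part of $\Delta'$ and $\Delta''$).

Let $\delta$ be a subdivision of the interval $(-\infty, +\infty)$:

$$\dots < \lambda_{-2} < \lambda_{-1} < \lambda_0 < \lambda_1 < \lambda_2 < \dots,$$

where the upper bound $\omega_\delta$ of the differences $\lambda_k - \lambda_{k-1}$
(k = 0, ±1, ±2, ...) is finite. We form the infinite sum

$$\sum_{k=-\infty}^{+\infty} \lambda'_k\, \Delta_k\mathcal{E}_\lambda x = \sum_{k=-\infty}^{+\infty} \lambda'_k\, (\mathcal{E}_{\lambda_k} - \mathcal{E}_{\lambda_{k-1}})\, x, \qquad (55)$$

where $\lambda_{k-1} \le \lambda'_k \le \lambda_k$ and x is an element of H. By (53), this sum
consists of mutually orthogonal elements and the necessary and sufficient
condition for its convergence is that the series [121]

$$\sum_{k=-\infty}^{+\infty} \|\lambda'_k\, \Delta_k\mathcal{E}_\lambda x\|^2 = \sum_{k=-\infty}^{+\infty} \lambda'^2_k\, \|\Delta_k\mathcal{E}_\lambda x\|^2 \qquad (56)$$

be convergent.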


This series is the sum $\sigma_\delta$ [3] for the integral

$$\int_{-\infty}^{+\infty} \lambda^2\, d(\mathcal{E}_\lambda x, x) = \int_{-\infty}^{+\infty} \lambda^2\, d\|\mathcal{E}_\lambda x\|^2, \qquad (57)$$

and we know from [5] that, if series (56) is convergent for some
subdivision $\delta$ and some choice of $\lambda'_k$, it is convergent for every subdivision
and every choice of $\lambda'_k$. The limit of sum (56) as $\omega_\delta \to 0$ is equal to
integral (57) and the existence of (57) as an improper integral is
equivalent to the convergence of sum (56). We are therefore justified in
considering sums (55) for the elements x for which series (56) is
convergent, or what amounts to the same thing, for which integral (57)
has a finite value. Let l denote the set of such x. On observing that

$$\|\Delta_k\mathcal{E}_\lambda(x + y)\|^2 \le (\|\Delta_k\mathcal{E}_\lambda x\| + \|\Delta_k\mathcal{E}_\lambda y\|)^2 \le 2\|\Delta_k\mathcal{E}_\lambda x\|^2 + 2\|\Delta_k\mathcal{E}_\lambda y\|^2,$$

we can say that, if $x \in l$ and $y \in l$, then $x + y \in l$. In addition, it is
obvious that, if $x \in l$ and a is a complex number, then $ax \in l$, i.e. l is
a lineal. If x belongs to the subspace onto which the operator $\mathcal{E}_\beta - \mathcal{E}_\alpha$
projects, the terms of sum (56) for which $\lambda_{k-1} \ge \beta$ or $\lambda_k \le \alpha$ are zero,
i.e. such x belong to l. On observing that $\mathcal{E}_\beta - \mathcal{E}_\alpha \Rightarrow E$ as $\alpha \to -\infty$
and $\beta \to +\infty$, we can say that the lineal l is everywhere dense in H.
Further, if $x \in l$, we can show, precisely as in [141], that sums (55)
have a definite limit in the sense of convergence in H as $\omega_\delta \to 0$.
This limit is naturally written as a Stieltjes integral, and it defines on
the lineal l a distributive operator Ax:

$$Ax = \int_{-\infty}^{+\infty} \lambda\, d\mathcal{E}_\lambda x. \qquad (58)$$


Let us write D(A) as usual for the lineal l. We recall that it consists
of the elements x for which integral (57) has a finite value. On forming
the scalar product of sum (55) with itself, using (54) and passing to
the limit, we obtain

$$\int_{-\infty}^{+\infty} \lambda^2\, d\|\mathcal{E}_\lambda x\|^2 = \|Ax\|^2 \quad (x \in D(A)), \qquad (59)$$

where the integral is the limit of sums (56) as $\omega_\delta \to 0$, or it can be
understood as an improper integral with infinite limits. On forming
the scalar product of (55) with any element y and passing to the limit,
we get the expression for a bilinear functional:

$$(Ax, y) = \int_{-\infty}^{+\infty} \lambda\, d(\mathcal{E}_\lambda x, y) \quad (x \in D(A);\ y \in H), \qquad (60)$$

and the integral here is the limit of the corresponding sums $\sigma_\delta$ as
$\omega_\delta \to 0$. If we replace y by $(\mathcal{E}_\beta - \mathcal{E}_\alpha)y$, we obtain, by (54):

$$(Ax, (\mathcal{E}_\beta - \mathcal{E}_\alpha)y) = \int_{\alpha}^{\beta} \lambda\, d(\mathcal{E}_\lambda x, y),$$

and, on passing to the limit as $\alpha \to -\infty$ and $\beta \to +\infty$, we have

$$(Ax, y) = \lim_{\alpha \to -\infty,\ \beta \to +\infty} \int_{\alpha}^{\beta} \lambda\, d(\mathcal{E}_\lambda x, y), \qquad (61)$$

i.e. integral (60) can be understood as an ordinary improper Stieltjes
integral, where $(\mathcal{E}_\lambda x, y)$ is a function of bounded variation.

Notice that the infinite interval $(-\infty, +\infty)$ has a finite measure
with respect to the non-decreasing function $\|\mathcal{E}_\lambda x\|^2$, and we can
interpret (57) as the Lebesgue-Stieltjes integral of an unbounded
non-negative function $\lambda^2$ over a set of finite measure [50].
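For a symmetric matrix the whole construction collapses to a finite sum, which makes the sums of the form (55) easy to try out. The sketch below is an illustration only and assumes a specific 2 x 2 example that does not come from the text: A = [[2, 1], [1, 2]], with eigenvalues 1 and 3 and the projectors `P1`, `P3` onto the corresponding eigenvectors. The spectral function is then a step function with jumps `P1` at 1 and `P3` at 3, and a Stieltjes sum over a fine subdivision should reproduce Ax up to the mesh width.

```python
P1 = [[0.5, -0.5], [-0.5, 0.5]]   # projector onto the eigenvector for lam = 1
P3 = [[0.5, 0.5], [0.5, 0.5]]     # projector onto the eigenvector for lam = 3
A = [[2.0, 1.0], [1.0, 2.0]]      # assumed example; A = 1*P1 + 3*P3


def apply(mat, v):
    """Matrix times vector in two dimensions."""
    return [sum(mat[i][j] * v[j] for j in range(2)) for i in range(2)]


def spectral_function(lam, v):
    """E_lam v: sum of the projections belonging to eigenvalues <= lam."""
    out = [0.0, 0.0]
    for eig, proj in ((1.0, P1), (3.0, P3)):
        if eig <= lam:
            pv = apply(proj, v)
            out = [out[0] + pv[0], out[1] + pv[1]]
    return out


def stieltjes_sum(v, lo=-1.0, hi=5.0, n=600):
    """Sum of form (55) on a uniform subdivision of [lo, hi], midpoint tags."""
    h = (hi - lo) / n
    total = [0.0, 0.0]
    for k in range(1, n + 1):
        left, right = lo + (k - 1) * h, lo + k * h
        e_right = spectral_function(right, v)
        e_left = spectral_function(left, v)
        mid = 0.5 * (left + right)
        total = [total[i] + mid * (e_right[i] - e_left[i]) for i in range(2)]
    return total


x = [1.0, 0.0]
approx, exact = stieltjes_sum(x), apply(A, x)
print(approx, exact)
```

Only the two subintervals containing a jump contribute, so the error is bounded by the mesh width times the size of the jumps. The definition with `eig <= lam` makes the step function continuous from the right, in line with condition (3).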

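Let x be any element of H. Now, $(\mathcal{E}_\beta - \mathcal{E}_\alpha)x$ belongs to the
subspace onto which $(\mathcal{E}_\beta - \mathcal{E}_\alpha)$ projects, and it follows from this, as we
saw above, that $(\mathcal{E}_\beta - \mathcal{E}_\alpha)x \in D(A)$ for any choice of element x. This
assertion does not hold for the element $\mathcal{E}_\mu x$. But, if $x \in D(A)$, i.e.
series (56) is convergent, the fact that
$\|\Delta_k\mathcal{E}_\lambda(\mathcal{E}_\mu x)\| = \|\mathcal{E}_\mu \Delta_k\mathcal{E}_\lambda x\| \le \|\Delta_k\mathcal{E}_\lambda x\|$
implies that series (56) is convergent when x is replaced
by $\mathcal{E}_\mu x$, i.e. if $x \in D(A)$, then $\mathcal{E}_\mu x \in D(A)$ for any $\mu$. If x is taken to
belong to D(A), we can replace x by $\mathcal{E}_\mu x$ in sum (55), taking $\mu$ as one
of the points of subdivision ($\mu = \lambda_p$). All the terms with k > p now
vanish, whilst the terms with $k \le p$ remain invariable, and we obtain
in the limit

$$A\mathcal{E}_\mu x = \int_{-\infty}^{\mu} \lambda\, d\mathcal{E}_\lambda x. \qquad (62)$$

This integral is the limit of a sum of form (55) when the interval
$(-\infty, \mu)$ is subdivided. On the other hand, if we apply to the sum (55)
the operator $\mathcal{E}_\mu$, which is bounded and therefore continuous, what has
been said above about the terms remains in force, and, in view of the
continuity of $\mathcal{E}_\mu$, we get in the limit as $\omega_\delta \to 0$:

$$\mathcal{E}_\mu Ax = \int_{-\infty}^{\mu} \lambda\, d\mathcal{E}_\lambda x \quad (x \in D(A));$$

on comparing with (62), we can write

$$\mathcal{E}_\mu Ax = A\mathcal{E}_\mu x \quad (x \in D(A)). \qquad (63)$$

Similarly, we have for any x:

$$A(\mathcal{E}_\beta - \mathcal{E}_\alpha)x = \int_{\alpha}^{\beta} \lambda\, d\mathcal{E}_\lambda x \quad (x \in H). \qquad (64)$$

If $x \in D(A)$, it follows also from (62) that

$$(\mathcal{E}_\beta - \mathcal{E}_\alpha)Ax = \int_{\alpha}^{\beta} \lambda\, d\mathcal{E}_\lambda x \quad (x \in D(A)), \qquad (64_1)$$

and we obtain, on letting $\alpha \to -\infty$ and $\beta \to +\infty$:

$$\lim_{\alpha \to -\infty,\ \beta \to +\infty} \int_{\alpha}^{\beta} \lambda\, d\mathcal{E}_\lambda x = Ax, \qquad (65)$$

i.e. integral (58), like (57), can be interpreted as an improper integral.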
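It follows at once from the above formulae that

$$(A\mathcal{E}_\mu x, y) = (\mathcal{E}_\mu Ax, y) = \int_{-\infty}^{\mu} \lambda\, d(\mathcal{E}_\lambda x, y) \quad (x \in D(A);\ y \in H); \qquad (66_1)$$

$$(A(\mathcal{E}_\beta - \mathcal{E}_\alpha)x, y) = \int_{\alpha}^{\beta} \lambda\, d(\mathcal{E}_\lambda x, y) \quad (x \in H;\ y \in H). \qquad (66_2)$$

If y as well as x belongs to D(A), we obtain on substituting y instead
of x in (58) and forming the scalar product from the left with x:

$$(x, Ay) = \int_{-\infty}^{+\infty} \lambda\, d(x, \mathcal{E}_\lambda y).$$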


On comparing with (60) and noting that $(x, \mathcal{E}_\lambda y) = (\mathcal{E}_\lambda x, y)$, we
get (Ax, y) = (x, Ay), i.e. A is a symmetric operator. Let us now show
that A is self-conjugate. It is enough to show that, if $z \in D(A^*)$, then
$z \in D(A)$. Let $\delta$ be a subdivision of $(-\infty, +\infty)$ and $P_j$ the
subspace defined by the projector $\mathcal{E}_{\lambda_j} - \mathcal{E}_{\lambda_{-j}}$. If $x \in P_j$, all the terms in
sums (55) and (56) vanish for $k \ge j + 1$ and $k \le -j$, the element
$x \in D(A)$ and, when $j + 1 > k > -j$, the subspaces defined by the
projectors $\Delta_k\mathcal{E}_\lambda$ belong to $P_j$, so that sum (55) and its limit Ax belong
to $P_j$. Hence it follows that $Ax \in D(A)$ and that $A^2 x \in P_j$. Suppose
that $z \in D(A^*)$; let us show that $z \in D(A)$. Let $z_j$ be the projection
of z onto $P_j$, so that $z_j \in D(A)$, i.e. $z_j \in D(A^*)$. The element $(z - z_j)$
also belongs to $D(A^*)$ and is orthogonal to $P_j$. By definition of $A^*$,
$(A^2 z_j, z - z_j) = (Az_j, A^*(z - z_j))$. But $A^2 z_j \in P_j$ and $z - z_j$ is
orthogonal to $P_j$, so that

$$(Az_j, A^*(z - z_j)) = 0.$$

On taking into account the obvious equation

$$(A^*z, A^*z) = (A^*(z - z_j), A^*(z - z_j)) + (A^*z_j, A^*z_j) + (A^*(z - z_j), A^*z_j) + (A^*z_j, A^*(z - z_j)),$$

the previous formula and $A^*z_j = Az_j$, we obtain

$$\|A^*z\|^2 = \|A^*(z - z_j)\|^2 + \|Az_j\|^2, \qquad (67)$$

whence

$$\|Az_j\|^2 \le \|A^*z\|^2.$$

Let us consider sum (56) with $x = z_j$. All its terms vanish for
$k \ge j + 1$ and $k \le -j$, whilst for the remaining terms, by (54):

$$\Delta_k\mathcal{E}_\lambda z_j = (\mathcal{E}_{\lambda_k} - \mathcal{E}_{\lambda_{k-1}})(\mathcal{E}_{\lambda_j} - \mathcal{E}_{\lambda_{-j}})\, z = (\mathcal{E}_{\lambda_k} - \mathcal{E}_{\lambda_{k-1}})\, z = \Delta_k\mathcal{E}_\lambda z.$$

Thus (56) gives in the limit, in view of (59):

$$\|Az_j\|^2 = \int_{\lambda_{-j}}^{\lambda_j} \lambda^2\, d\|\mathcal{E}_\lambda z\|^2,$$

and, by (67),

$$\int_{\lambda_{-j}}^{\lambda_j} \lambda^2\, d\|\mathcal{E}_\lambda z\|^2 \le \|A^*z\|^2.$$

If we let j increase indefinitely, it will be seen that integral (57) has
a finite value for x = z, i.e. $z \in D(A)$, so that A is a self-conjugate
operator.




The above discussion leads to the following theorem: 

THEOREM 1. For every resolution of the identity $\mathcal{E}_\lambda$ there is a
corresponding self-conjugate operator A, defined for all the elements x for
which integral (57) has finite values. The operator A is defined as the
limit of sums (55), or what amounts to the same thing, as integral (58).
The corresponding bilinear functional is defined by (60).

The converse can be proved.

THEOREM 2. Given any self-conjugate operator A, there exists a
resolution of the identity $\mathcal{E}_\lambda$ such that A is expressed by (58).

The proof of this theorem will be given later. We shall prove below
a formula that gives $\mathcal{E}_\lambda$ in terms of A [cf. 144], where distinct A
correspond to distinct $\mathcal{E}_\lambda$. The operator $\mathcal{E}_\lambda$ is called the spectral function
of the self-conjugate operator A. It can be shown, precisely as in
[144], that regular points $\lambda = \mu$ of the spectrum are characterized
by the fact that an interval exists, having $\mu$ as an interior point,
in which $\mathcal{E}_\lambda$ is constant. We have to bear in mind here that
$(\mathcal{E}_\beta - \mathcal{E}_\alpha)x \in D(A)$ for any $x \in H$ and any finite $\alpha$ and $\beta$. The eigenvalues
$\lambda = \nu$ are characterized by the fact that $\mathcal{E}_\lambda$ has a jump at $\lambda = \nu$,
the difference $\mathcal{E}_\nu - \mathcal{E}_{\nu - 0}$ being a projector onto the subspace of
corresponding eigenelements (including the zero element) [145].

Let the self-conjugate operator A be semi-bounded, and let $m_A$
be its strict lower bound:

$$m_A = \inf\, (Ax, x) \quad \text{for } x \in D(A) \text{ and } \|x\| = 1.$$

We have, given any real $\lambda$,

$$((A - \lambda E)x, x) \ge (m_A - \lambda)(x, x) \quad (x \in D(A)),$$

whence

$$\|(A - \lambda E)x\| \cdot \|x\| \ge (m_A - \lambda)\, \|x\|^2$$

and

$$\|(A - \lambda E)x\| \ge (m_A - \lambda)\, \|x\|.$$

It follows from this that all the $\lambda$ satisfying the condition $\lambda < m_A$
are regular points of A. Let us show that $\lambda = m_A$ is a point of the
spectrum of A. If this were not the case, there would exist an $m_1 > m_A$
such that all the values $\lambda < m_1$ are regular points of A, and $\mathcal{E}_\lambda$ is the
annihilation operator for $\lambda < m_1$, so that

$$(Ax, x) = \int_{m_1}^{+\infty} \lambda\, d(\mathcal{E}_\lambda x, x) \quad (x \in D(A)),$$

whence it follows that $(Ax, x) \ge m_1 (x, x)$ for $x \in D(A)$, and this
contradicts the definition of $m_A$. The value $\lambda = m_A$ is also known as
the lower bound of the spectrum of A.

The operator $\mathcal{E}_\lambda$ is called the spectral function of the self-conjugate
operator A. We shall now re-iterate briefly the properties of general
self-conjugate operators, which are precisely similar to those of bounded
self-conjugate operators.


193. Continuous functions of a self-conjugate operator. Let $f(\lambda)$
be a bounded function, uniformly continuous in $(-\infty, +\infty)$ (say
$f(\lambda)$ is continuous in the closed interval $[-\infty, +\infty]$). We form the
sum, analogous to (55):

$$\sum_{k=-\infty}^{+\infty} f(\lambda'_k)\, \Delta_k\mathcal{E}_\lambda x, \qquad (68)$$

where x is any element of H. It may easily be seen that this series,
consisting of mutually orthogonal elements, is convergent for any x.
For, $|f(\lambda)| \le K$ by hypothesis, where K is a definite number and

$$\Big\| \sum_{k=-\infty}^{+\infty} f(\lambda'_k)\, \Delta_k\mathcal{E}_\lambda x \Big\|^2 = \sum_{k=-\infty}^{+\infty} |f(\lambda'_k)|^2\, \|\Delta_k\mathcal{E}_\lambda x\|^2 \le K^2 \sum_{k=-\infty}^{+\infty} \|\Delta_k\mathcal{E}_\lambda x\|^2 = K^2 \|x\|^2, \qquad (69)$$

and the series analogous to (56) is therefore convergent. It can be
shown, precisely as in [141], that, given any x of H, sum (68) has a
definite limit as $\omega_\delta \to 0$. This limit yields a distributive operator
f(A), defined throughout H. It follows at once from (69) that
$\|f(A)x\| \le K\|x\|$, i.e. f(A) is a bounded operator. It is natural to
write the limit of sum (68) as a Stieltjes integral:

$$f(A)x = \int_{-\infty}^{+\infty} f(\lambda)\, d\mathcal{E}_\lambda x \quad (x \in H), \qquad (70)$$

$$(f(A)x, y) = \int_{-\infty}^{+\infty} f(\lambda)\, d(\mathcal{E}_\lambda x, y) \quad (x \in H;\ y \in H). \qquad (71)$$

The latter integral can be interpreted as an ordinary Stieltjes
integral, such as we defined in [4].
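When the spectral function has only finitely many jumps, the Stieltjes integral defining f(A) reduces to a finite sum of $f(\lambda_k)$ times the jump projectors. The sketch below illustrates the resulting bound $\|f(A)x\| \le K\|x\|$ under assumptions made only for the example: the matrix with eigenvalues 1 and 3 and projectors `P1`, `P3`, and the bounded function $f(\lambda) = 1/(1 + \lambda^2)$, for which the bound K = 1 can be taken.

```python
import math

P1 = [[0.5, -0.5], [-0.5, 0.5]]   # projector for the eigenvalue 1 (assumed)
P3 = [[0.5, 0.5], [0.5, 0.5]]     # projector for the eigenvalue 3 (assumed)
SPECTRUM = ((1.0, P1), (3.0, P3))


def f(lam):
    """A bounded continuous function on the real line; |f| <= 1."""
    return 1.0 / (1.0 + lam * lam)


def apply(mat, v):
    """Matrix times vector in two dimensions."""
    return [sum(mat[i][j] * v[j] for j in range(2)) for i in range(2)]


def f_of_A(v):
    """f(A)v as the finite sum of f(lam_k) P_k v over the jumps of E_lam."""
    out = [0.0, 0.0]
    for eig, proj in SPECTRUM:
        pv = apply(proj, v)
        out = [out[i] + f(eig) * pv[i] for i in range(2)]
    return out


def norm(v):
    return math.hypot(v[0], v[1])


x = [3.0, -1.0]
y = f_of_A(x)
print(y, norm(y), norm(x))
```

Because the projections `P1 x` and `P3 x` are orthogonal, the squared norm of `y` is the sum of the squared norms of its two pieces, which is the finite counterpart of the orthogonality used in (69).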


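We have the exact analogue of (62) and (63):

$$\mathcal{E}_\mu f(A)x = f(A)\mathcal{E}_\mu x = \int_{-\infty}^{\mu} f(\lambda)\, d\mathcal{E}_\lambda x, \qquad (72)$$

$$(\mathcal{E}_\mu f(A)x, y) = (f(A)\mathcal{E}_\mu x, y) = \int_{-\infty}^{\mu} f(\lambda)\, d(\mathcal{E}_\lambda x, y) \quad (x \in H;\ y \in H). \qquad (73)$$

We could have applied the above definition of f(A) to the case of
any bounded function, continuous in $(-\infty, +\infty)$, without requiring
its uniform continuity. We shall show later that the concept of
function of a self-conjugate operator can be extended to a wider class of
functions $f(\lambda)$.

We have the obvious formulae for an operator (A − lE), where
l is any given number:

$$(A - lE)x = \int_{-\infty}^{+\infty} (\lambda - l)\, d\mathcal{E}_\lambda x; \quad ((A - lE)x, y) = \int_{-\infty}^{+\infty} (\lambda - l)\, d(\mathcal{E}_\lambda x, y) \quad (x \in D(A);\ y \in H). \qquad (74)$$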


194. The resolvent. Let us find an expression for the resolvent in 
terms of the spectral function. 

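If l is not real, $1/(\lambda - l)$ is a function of $\lambda$, continuous in the closed
interval $[-\infty, +\infty]$, and we can form the bounded operator

$$R_l = \int_{-\infty}^{+\infty} \frac{1}{\lambda - l}\, d\mathcal{E}_\lambda. \qquad (75)$$

Let us show that this has all the properties of the resolvent when
$\lambda = l$, which will justify us in writing it as $R_l$. Given any x, the
element $(\mathcal{E}_\beta - \mathcal{E}_\alpha)R_l x \in D(A)$, and we have by $(66_2)$:

$$((A - lE)(\mathcal{E}_\beta - \mathcal{E}_\alpha)R_l x, y) = \int_{\alpha}^{\beta} (\lambda - l)\, d(\mathcal{E}_\lambda R_l x, y).$$

On the other hand, by (73) and (75):

$$(\mathcal{E}_\lambda R_l x, y) = \int_{-\infty}^{\lambda} \frac{1}{\mu - l}\, d(\mathcal{E}_\mu x, y).$$

On substituting in the previous formula and using the properties
of the Stieltjes integral [9], we obtain

$$((A - lE)(\mathcal{E}_\beta - \mathcal{E}_\alpha)R_l x, y) = \int_{\alpha}^{\beta} d(\mathcal{E}_\lambda x, y) = ((\mathcal{E}_\beta - \mathcal{E}_\alpha)x, y),$$

whence, since y is arbitrary,

$$(A - lE)(\mathcal{E}_\beta - \mathcal{E}_\alpha)R_l x = (\mathcal{E}_\beta - \mathcal{E}_\alpha)x. \qquad (76)$$

Let us form the two number sequences $\alpha_n$ and $\beta_n$, where $\alpha_n \to -\infty$
and $\beta_n \to +\infty$, and the element sequence $y_n = (\mathcal{E}_{\beta_n} - \mathcal{E}_{\alpha_n})R_l x$.
We have $y_n \in D(A)$, $y_n \Rightarrow R_l x$, and, by (76), $(A - lE)y_n \Rightarrow x$.
Hence it follows, since A is closed, that $R_l x \in D(A)$ for any x and
$(A - lE)R_l x = x$.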
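In the finite-dimensional case the integral representation of the resolvent is a finite sum, $R_l = \sum_k P_k / (\lambda_k - l)$, and both inverse relations can be checked directly. The sketch below does this under assumptions chosen only for illustration: the symmetric matrix A = [[2, 1], [1, 2]] with spectral projectors `P1`, `P3`, and the non-real point l = 0.5 + 2i. The function names are hypothetical.

```python
P1 = [[0.5, -0.5], [-0.5, 0.5]]   # projector for the eigenvalue 1 (assumed)
P3 = [[0.5, 0.5], [0.5, 0.5]]     # projector for the eigenvalue 3 (assumed)
A = [[2.0, 1.0], [1.0, 2.0]]      # A = 1*P1 + 3*P3


def apply(mat, v):
    """Matrix times vector in two dimensions (entries may be complex)."""
    return [sum(mat[i][j] * v[j] for j in range(2)) for i in range(2)]


def resolvent_apply(l, v):
    """R_l v as the finite sum of P_k v / (lam_k - l), the analogue of (75)."""
    out = [0j, 0j]
    for eig, proj in ((1.0, P1), (3.0, P3)):
        pv = apply(proj, v)
        out = [out[i] + pv[i] / (eig - l) for i in range(2)]
    return out


def shifted_apply(l, v):
    """(A - lE) v."""
    av = apply(A, v)
    return [av[i] - l * v[i] for i in range(2)]


l = 0.5 + 2.0j
x = [1.0, -4.0]
right_inverse = shifted_apply(l, resolvent_apply(l, x))   # (A - lE) R_l x
left_inverse = resolvent_apply(l, shifted_apply(l, x))    # R_l (A - lE) x
print(right_inverse, left_inverse)
```

Both results agree with x up to rounding. In this setting the limiting argument with $\alpha_n$ and $\beta_n$ is not needed, since the spectral function is constant outside a finite interval.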

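It remains to show that $R_l(A - lE)x = x$ for $x \in D(A)$. This
follows at once from the formulae

$$(R_l(A - lE)x, y) = \int_{-\infty}^{+\infty} \frac{1}{\lambda - l}\, d(\mathcal{E}_\lambda(A - lE)x, y)$$

and

$$(\mathcal{E}_\lambda(A - lE)x, y) = \int_{-\infty}^{\lambda} (\mu - l)\, d(\mathcal{E}_\mu x, y),$$

which are consequences of (71) and $(66_1)$.

The formula defining the spectral function in terms of the resolvent
remains in force:

$$\frac{1}{2}\big[(\mathcal{E}_{\lambda - 0}x, y) + (\mathcal{E}_\lambda x, y)\big] = \lim_{\tau \to +0} \frac{1}{2\pi i} \int_{-\infty}^{\lambda} ((R_{\sigma + i\tau} - R_{\sigma - i\tau})x, y)\, d\sigma. \qquad (77)$$

A definite spectral function $\mathcal{E}_\lambda$ corresponds to a self-conjugate
operator, and the operator is bounded when and only when $\mathcal{E}_\lambda$ is
variable only on a finite interval.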
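The inversion formula that recovers the spectral function from the resolvent can be followed by hand in the simplest case. The sketch below is an illustration under assumptions that are not in the text: the space is one-dimensional, A is multiplication by a single eigenvalue `LAM0`, and x is a unit element, so that $(R_l x, x) = 1/(\lambda_0 - l)$ and $(\mathcal{E}_\lambda x, x)$ jumps from 0 to 1 at `LAM0`. It also assumes the usual normalizing factor $1/(2\pi i)$ in front of the integral. With these choices the integrand is the Poisson kernel and the integral has the closed form $\tfrac12 + \tfrac1\pi \arctan((\lambda - \lambda_0)/\tau)$.

```python
import math

LAM0 = 1.0    # assumed single eigenvalue; E_lam jumps from 0 to 1 at LAM0


def integrand(sigma, tau):
    """((R_{sigma+i tau} - R_{sigma-i tau})x, x) / (2 pi i) for a unit x."""
    upper = 1.0 / (LAM0 - (sigma + 1j * tau))
    lower = 1.0 / (LAM0 - (sigma - 1j * tau))
    return ((upper - lower) / (2j * math.pi)).real


def right_side(lam, tau):
    """Closed form of the integral of `integrand` over (-inf, lam]."""
    return 0.5 + math.atan((lam - LAM0) / tau) / math.pi


for lam in (0.0, 1.0, 2.0):
    print(lam, [round(right_side(lam, tau), 4) for tau in (1e-1, 1e-3, 1e-6)])
```

As $\tau$ decreases the values tend to 0 to the left of the eigenvalue and to 1 to the right of it, while at the eigenvalue itself they equal one half for every $\tau$, which is the mean of the left and right limits of the spectral function.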

We have shown [191] that the necessary and sufficient condition
for a bounded operator $B$, defined everywhere, to commute with
a self-conjugate operator $A$ is that

$$B R_l = R_l B \tag{78}$$

for any $l$ for which the resolvent exists. Let us now prove the
following theorem:

THEOREM. The necessary and sufficient condition for $B$ to commute
with $A$ is that, given any real $\lambda$, we have

$$B \mathcal{E}_\lambda = \mathcal{E}_\lambda B. \tag{79}$$




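It is sufficient to show that conditions (78) and (79) are equivalent.
By (75), we have for any elements $x$ and $y$:

$$(B R_l x,\, y) = (R_l x,\, B^* y) = \int_{-\infty}^{+\infty} \frac{1}{\lambda - l}\, d(\mathcal{E}_\lambda x,\, B^* y),$$

$$(R_l B x,\, y) = \int_{-\infty}^{+\infty} \frac{1}{\lambda - l}\, d(\mathcal{E}_\lambda B x,\, y). \tag{80}$$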


If condition (79) is fulfilled, the right-hand, and therefore the
left-hand sides of equations (80) are the same, and, since $x$ and $y$ are
arbitrary, condition (78) is fulfilled. Conversely, if (78) is fulfilled,
$(B \mathcal{E}_\lambda x, y)$ and $(\mathcal{E}_\lambda B x, y)$ can only differ by a constant or at points
of discontinuity, since the inversion of the Cauchy-Stieltjes
integral [29] is unique. But both the functions tend to zero as $\lambda \to -\infty$
and are continuous from the right at points of discontinuity, so that
we have $(B \mathcal{E}_\lambda x, y) = (\mathcal{E}_\lambda B x, y)$ for any $x$ and $y$, i.e. condition (79)
is fulfilled, and the theorem is proved.

Notice that, by virtue of the results of [156], $\mathcal{E}_\mu$ commutes with
$A$ for any $\mu$. This also follows from the last theorem. It also follows
from this theorem, together with (70), that $f(A)$ commutes with $A$.


195. Eigenvalues. As already mentioned, $\lambda = \lambda'$ is an eigenvalue
of $A$ when and only when $\mathcal{E}_\lambda$ has $\lambda'$ as a point of discontinuity, and
here, $\mathcal{E}_{\lambda'} - \mathcal{E}_{\lambda' - 0}$ is the projector into the subspace of corresponding
eigenelements (including the zero element). On assuming $H$
separable as usual, we can say that the number of eigenvalues, if
there are such, is finite or denumerable. The rank of an eigenvalue
is defined as before, and it can be assumed that the set of all
eigenelements forms an orthonormal system $x_1, x_2, \ldots$ Let $\lambda_k$ be the points
of discontinuity of $\mathcal{E}_\lambda$, $L_k$ the subspaces of corresponding
eigenelements, and $P_k = \mathcal{E}_{\lambda_k} - \mathcal{E}_{\lambda_k - 0}$ the projectors into these subspaces.

We form the orthogonal sum

$$H' = L_1 \oplus L_2 \oplus L_3 \oplus \ldots \tag{81}$$

The subspace $H'$ reduces $A$, and the operator $A'$, induced by $A$
into $H'$, is self-conjugate and has a purely point spectrum.




If the operator already has a purely point spectrum in $H$, then

$$\mathcal{E}_\lambda = \sum_{\lambda_k \le \lambda} (\mathcal{E}_{\lambda_k} - \mathcal{E}_{\lambda_k - 0}). \tag{82}$$

Let $x_1, x_2, \ldots$ be any complete system, orthonormal in $H$, and
$\mu_1, \mu_2, \ldots$ a sequence of real numbers, some of which may be the
same, $\mu_k$ and $x_k$ being described as corresponding. Further, let
$\lambda_1, \lambda_2, \ldots$ be the different $\mu_k$, $L_k$ the subspace formed by the $x_s$ which
correspond to the $\mu_s$ equal to $\lambda_k$, and $P_k$ the projector onto $L_k$.

We define the projector

$$\mathcal{E}_\lambda = \sum_{\lambda_k \le \lambda} P_k.$$

This is a resolution of the identity, which corresponds to a
self-conjugate operator $C$ with purely point spectrum. Its eigenvalues
are $\lambda_k$, and $x_1, x_2, \ldots$ is the complete set of eigenelements. If all the $\lambda_k$
belong to a finite interval, $C$ is a bounded operator.
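The construction of $C$ can be sketched in finite dimensions. The snippet below is only an illustration under stated assumptions: the complete orthonormal system is taken to be the standard basis of a four-dimensional space, so every projector is diagonal and is stored as a list of 0/1 entries; the names `spectral_function` and `apply_C` and the sample numbers are hypothetical.

```python
def spectral_function(mu, lam):
    """Diagonal of E_lam = sum of P_k over lam_k <= lam, in the basis x_1, x_2, ..."""
    return [1.0 if m <= lam else 0.0 for m in mu]

def apply_C(mu, x):
    """C x = sum_k lam_k P_k x, summed over the distinct values lam_k of mu."""
    total = [0.0] * len(x)
    for lam_k in sorted(set(mu)):
        for j, m in enumerate(mu):
            if m == lam_k:                  # the basis element x_j lies in L_k
                total[j] += lam_k * x[j]
    return total

mu = [2.0, -1.0, 2.0, 5.0]      # mu_1 = mu_3, so the eigenvalue 2 has rank 2
x = [1.0, 2.0, 3.0, 4.0]

# The jump of E_lam at lam' = 2 is the projector onto the eigen-subspace L_k.
E_at_2 = spectral_function(mu, 2.0)
E_below_2 = spectral_function(mu, 2.0 - 1e-9)
jump_at_2 = [a - b for a, b in zip(E_at_2, E_below_2)]
Cx = apply_C(mu, x)
```

Here `jump_at_2` has ones exactly in the coordinates belonging to the eigenvalue 2, and the spectral function vanishes to the left of the smallest $\lambda_k$ and equals the identity from the largest $\lambda_k$ onwards, as a resolution of the identity must.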


196. The case of a mixed spectrum. First, some remarks
supplementary to what we said in [191] about the decomposition of a
self-conjugate operator $A$ into operators with a purely point and a purely
continuous spectrum. Let some subspace $H'$ reduce $A$. It now reduces
$\mathcal{E}_\lambda$ and, on writing $A'$ and $\mathcal{E}'_\lambda$ for the operators induced by $A$ and
$\mathcal{E}_\lambda$ into $H'$, we can say that $\mathcal{E}'_\lambda$ is a resolution of the identity in $H'$
and

$$A' x = \int_{-\infty}^{+\infty} \lambda\, d\mathcal{E}'_\lambda x \qquad (x \in D(A')), \tag{83}$$

i.e. $\mathcal{E}'_\lambda$ is the spectral function of $A'$. Let $A''$ and $\mathcal{E}''_\lambda$ be the operators
induced by $A$ and $\mathcal{E}_\lambda$ into the subspace $H'' = H \ominus H'$. If $x = x' + x''$
and $y = y' + y''$ are decompositions of $x$ and $y$ onto $H'$ and
$H''$, where $x \in D(A)$, we have [191]:

$$Ax = A'x' + A''x''; \qquad \mathcal{E}_\lambda y = \mathcal{E}'_\lambda y' + \mathcal{E}''_\lambda y'';$$

$$(Ax,\, y) = (A'x',\, y') + (A''x'',\, y''). \tag{84}$$

Similar formulae hold in the case of a finite or infinite number of
mutually orthogonal subspaces reducing $A$. We now return to the
notation of [191] and suppose that $H'$ is not the whole of $H$. The
operator $A'$ has a purely point spectrum in $H'$, whilst $A''$ has a purely
continuous spectrum in $H''$. Now,

$$\mathcal{E}'_\lambda = \sum_{\lambda_k \le \lambda} (\mathcal{E}_{\lambda_k} - \mathcal{E}_{\lambda_k - 0}),$$

whilst the spectral function $\mathcal{E}''_\lambda$ of $A''$, expressed by $\mathcal{E}''_\lambda = \mathcal{E}_\lambda - \mathcal{E}'_\lambda$,
has no discontinuities. If $A$ is an unbounded operator, one of the
operators $A'$ or $A''$ may be bounded. For instance, if all the points of
discontinuity of $\mathcal{E}_\lambda$ lie in a finite interval, $A'$ is a bounded operator.
Suppose that $A$ has a purely continuous spectrum, and let $C_x$ denote,
as in [147], the closed linear envelope of elements $\mathcal{E}_\lambda x$. We say that
$A$ has a simple continuous spectrum if there exists an element $x$
such that $C_x$ coincides with $H$. Here, (254) and (256) of [147] will
hold, with Hellinger integrals over an infinite interval. If $y \in D(A)$,
(259) and (261) of [147] will also hold. We can regard the corresponding
integrals as improper with an infinite interval of integration. On
using the inequality [147]


| Ag, (A) |? < 4e (2) -4 || Fy IÈ, 


we can show, as in [192] as regards sums (55), that the infinite sums
of mutually orthogonal elements, corresponding to integral (261) of
[147], yield a convergent series by virtue of the fact that $y \in D(A)$.
In the general case of a continuous spectrum, it follows from the
proof of Theorem 2 of [147] that $C_x$ reduces $\mathcal{E}_\lambda$ for any $\lambda$, i.e. it also
reduces $A$. The operator induced by $A$ into $C_x$ has a simple continuous
spectrum, and, as in [147], the operator with a purely continuous
spectrum can be split into operators with simple continuous spectra
in mutually orthogonal subspaces, the orthogonal sum of which gives
the whole of $H$. In all the formulae we have a sum of Hellinger integrals
instead of one such integral. A connection can be established, precisely 
as in [152], between C, and LẸ. 

Let $A$ be a self-conjugate operator and $U$ a unitary operator. The
operator $A' = UAU^{-1}$ is defined on the lineal $D(A')$, obtained by
applying $U$ to the lineal $D(A)$. Let us show that $A'$ is self-conjugate.
Let

$$(UAU^{-1} x,\, y) = (x,\, y^*)$$

for all $x$ of $D(A')$. We have to show that $y \in D(A')$ and that
$y^* = UAU^{-1} y$. The above equation can be rewritten as
$(Ax', U^{-1} y) = (Ux', y^*)$, where $x' = U^{-1} x$ is any element of $D(A)$, or as
$(Ax', U^{-1} y) = (x', U^{-1} y^*)$, whence it follows, since $A$ is self-conjugate,
that $U^{-1} y \in D(A)$ and $U^{-1} y^* = AU^{-1} y$, i.e. $y \in D(A')$ and
$y^* = UAU^{-1} y$, which is what we set out to prove.

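Let $\mathcal{E}_\lambda$ be the spectral function of operator $A$. Now $\mathcal{E}'_\lambda = U \mathcal{E}_\lambda U^{-1}$
has all the properties of a resolution of the identity, and
$\| \mathcal{E}'_\lambda x \| = \| \mathcal{E}_\lambda U^{-1} x \|$. If $x \in D(A')$, then $U^{-1} x \in D(A)$, so that the series

$$\sum_{k = -\infty}^{+\infty} \lambda_k^2\, \| \Delta_k \mathcal{E}'_\lambda x \|^2$$

is convergent, and, as in [192], the sum

$$\sum_{k = -\infty}^{+\infty} \lambda_k\, \Delta_k \mathcal{E}'_\lambda x$$

has a limit $A' x = UAU^{-1} x$, whence it is clear that $\mathcal{E}'_\lambda = U \mathcal{E}_\lambda U^{-1}$
is the spectral function of $A'$. The test of [153] for the unitary
equivalence of operators remains in force.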

The concepts of differential solution and of a complete system of
differential solutions are retained without change. Every differential
solution $x(\lambda)$ (continuous in the sense of space $H$) has the form $\mathcal{E}_\lambda x$,
where $x \in D(A)$; it is assumed here that $x(\lambda) \Rightarrow 0$ as $\lambda \to -\infty$.


197. Functions of a self-conjugate operator. If $\mathcal{E}_\lambda$ is the spectral
function of a self-conjugate operator $A$ and $f(\lambda)$ is a bounded function
in the interval $-\infty < \lambda < +\infty$, measurable with respect to all
non-decreasing functions $\| \mathcal{E}_\lambda x \|^2$, then

$$(f(A) x,\, y) = \int_{-\infty}^{+\infty} f(\lambda)\, d(\mathcal{E}_\lambda x,\, y) \qquad (x \in H;\ y \in H) \tag{85}$$

defines, precisely as in [155], a bounded operator $f(A)$, defined
throughout $H$, and having all the properties indicated in [155]. Notice that
the values of $f(\lambda)$ on a set of measure zero with respect to all the
$\| \mathcal{E}_\lambda x \|^2$ have no effect on integral (85), i.e. they have no effect on
$f(A)$. Let us now generalize the concept of a function of an operator
$f(A)$ to real functions $f(\lambda)$ with finite values, that are measurable as
before with respect to all $\| \mathcal{E}_\lambda x \|^2$, but are unbounded. Let $f_N(\lambda)$
denote the cut-off function, i.e. the function defined by the
equations $f_N(\lambda) = f(\lambda)$ if $|f(\lambda)| \le N$, $f_N(\lambda) = N$ if $f(\lambda) > N$, and
$f_N(\lambda) = -N$ if $f(\lambda) < -N$. We can form the bounded operator $f_N(A)$ for
the bounded function $f_N(\lambda)$. We now have, by (302) of [155], when
$y = x$ and $f_1(\lambda) = f_2(\lambda)$:

$$\| f_N(A) x - f_M(A) x \|^2 = \int_{-\infty}^{+\infty} | f_N(\lambda) - f_M(\lambda) |^2\, d\| \mathcal{E}_\lambda x \|^2. \tag{86}$$




If $f(\lambda)$ belongs to $L_2$ with respect to $\| \mathcal{E}_\lambda x \|^2$, the right-hand side
tends to zero as $M$ and $N \to +\infty$, since $|f_N(\lambda)| \le |f(\lambda)|$ and
$f_N(\lambda) \to f(\lambda)$ almost everywhere with respect to $\| \mathcal{E}_\lambda x \|^2$, i.e. the
sequence $f_N(A) x$ is mutually convergent, and a limiting element
exists, which we write as $f(A) x$, i.e. $f_N(A) x \Rightarrow f(A) x$ as $N \to +\infty$, if

$$\int_{-\infty}^{+\infty} f^2(\lambda)\, d\| \mathcal{E}_\lambda x \|^2 < +\infty. \tag{87}$$

It is natural to write $D[f(A)]$ for the set of elements $x$ that satisfy
this condition. Some relevant results will be mentioned, whilst
omitting the proofs.
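Condition (87) is easiest to see for a purely point spectrum, where the integral reduces to a series. The sketch below assumes a hypothetical operator with simple eigenvalues $\mu_k = k$ ($k = 1, 2, \ldots$) and takes $f(\lambda) = \lambda$; an element is described by its coordinates in the eigenbasis, and the helper `partial_sum` is an illustrative name, not something from the text.

```python
def partial_sum(f, mu, coeff, n):
    """Sum of f(mu_k)^2 * |x_k|^2 over k = 1..n: the discrete form of integral (87)."""
    return sum(f(mu(k)) ** 2 * abs(coeff(k)) ** 2 for k in range(1, n + 1))

f = lambda lam: lam                 # unbounded function f(lambda) = lambda
mu = lambda k: float(k)             # assumed eigenvalues mu_k = k

# Coordinates 1/k^2: the series is sum 1/k^2, so the element lies in D[f(A)].
inside = [partial_sum(f, mu, lambda k: 1.0 / k ** 2, n) for n in (10, 100, 1000)]

# Coordinates 1/k: the element is in H, but the series is sum 1 and diverges.
outside = [partial_sum(f, mu, lambda k: 1.0 / k, n) for n in (10, 100, 1000)]
```

The partial sums in `inside` stay below $\pi^2/6$, whereas those in `outside` grow like $n$, so the second element belongs to $H$ but not to $D[f(A)]$.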

The lineal $D[f(A)]$ is everywhere dense in $H$, $f(A)$ is a
self-conjugate operator, and

$$(f(A) x,\, y) = \int_{-\infty}^{+\infty} f(\lambda)\, d(\mathcal{E}_\lambda x,\, y) \qquad (x \in D[f(A)],\ y \in H). \tag{88}$$

In the case of a complex function $f(\lambda) = f_1(\lambda) + f_2(\lambda) i$, we
take $f(A) = f_1(A) + f_2(A) i$, and the lineal $D[f(A)]$ is defined as
before by condition (87) with $f^2(\lambda)$ replaced by $|f(\lambda)|^2$. The
following proposition holds, precisely as in [156]: the necessary and sufficient
condition for an operator to be a function of a self-conjugate operator
$A$ is that it be closed and commute with any bounded operator that
commutes with $A$.

Let us turn to the question of the commutation of general
self-conjugate operators. On recalling the theorem of [156], the following
definition is naturally arrived at: two self-conjugate operators $A$ and
$B$ are said to commute if their spectral functions $\mathcal{E}_\lambda$ and $\mathcal{F}_\mu$ (bounded
operators) commute for any $\lambda$ and $\mu$. By the theorem mentioned,
this definition is equivalent to the ordinary one if $A$ and $B$ are bounded.
If $A$ is non-bounded and $B$ is bounded, we have the definition of
commutation of [191]. It may easily be seen to coincide with the one
just given, if $A$ and $B$ are self-conjugate. For, by the theorem of
[191], commutation in the previous sense is equivalent to the fact
that $B$ commutes with $\mathcal{E}_\lambda$ for any $\lambda$, whilst this is equivalent to the
fact [143] that, given any $\mu$, $\mathcal{F}_\mu$ commutes with all $\mathcal{E}_\lambda$, i.e. we arrive
at our new definition of commutation.

By starting from the new definition of commutation of
self-conjugate operators, it can be shown that real functions of the same
self-conjugate operator $A$ commute and that, if the self-conjugate
operators $A_1, A_2, \ldots$ commute in pairs, they are all functions of the same
operator $A$ (cf. [156]).

Let us consider sums and products of unbounded operators. The
operator $(A + B) x = Ax + Bx$ is defined for elements $x$ that belong
simultaneously to $D(A)$ and $D(B)$. The operator $(AB) x = A(Bx)$ is
defined for $x$ such that $x \in D(B)$ and $Bx \in D(A)$. If $a$ is any complex
number, the operator $(aA) x = a(Ax)$ is defined on $D(A)$. Let $A$
and $B$ be self-conjugate commuting operators, where $B$ is bounded
and defined on the whole of $H$. The operator $ABx$ is now defined on
the lineal $l'$ of $x$ such that $Bx \in D(A)$. In accordance with the
definition of commutation, if $x \in D(A)$, then $Bx \in D(A)$, i.e. $D(A)$ belongs
to $l'$, though $l'$ may be a wider lineal than $D(A)$. Let us show that
$AB$ is self-conjugate on $l'$. Let $(ABx, y) = (x, y^*)$ for $x \in l'$, and all
the more for $x \in D(A)$. We have to show that $y \in l'$ and that
$y^* = ABy$. On assuming that $x \in D(A)$, we can replace the equation
in question by $(BAx, y) = (x, y^*)$ for $x \in D(A)$, or, since $B$ is bounded
and self-conjugate, $(Ax, By) = (x, y^*)$ for $x \in D(A)$; hence, since $A$
is self-conjugate, we have $By \in D(A)$ and $y^* = ABy$, which is what
we set out to prove. Notice that, if $A$ and $B$ are unbounded commuting
self-conjugate operators, the operator $AB$ may not be self-conjugate,
although its conjugate $(AB)^*$ is always self-conjugate.

Let us apply the definitions of sum and product to the power of
an operator $A$. The lineal $D(A^2)$ consists of the $x$ such that $x \in D(A)$
and $Ax \in D(A)$, i.e. $D(A^2)$ belongs to $D(A)$ and may be narrower
than $D(A)$. Similarly, $D(A^3)$ consists of the $x$ such that $x \in D(A^2)$
and $A^2 x \in D(A)$, so that $D(A^3)$ belongs to $D(A^2)$. A polynomial of
the form $a_0 + a_1 A + a_2 A^2 + \ldots + a_n A^n$ is obviously defined on the lineal
$D(A^n)$. It can be shown that this polynomial coincides with the
function of operator $A$ defined above, if we take
$f(\lambda) = a_0 + a_1 \lambda + \ldots + a_n \lambda^n$, and that the set of elements on which all the polynomials
are defined is a lineal dense in $H$.

If the self-conjugate operator $A$ is positive, i.e. the lower bound
of the spectrum $m_A \ge 0$, we can form as in [143] the positive
self-conjugate operator $A^{1/2}$, the square of which is equal to $A$:

$$A^{1/2} = \int_0^{+\infty} \sqrt{\lambda}\, d\mathcal{E}_\lambda.$$


It may easily be seen that only one positive self-conjugate operator
$B$ exists, the square of which is $A$. For, let $\mathcal{F}_\mu$ be a resolution of
the identity of $B$. We must have

$$B = \int_0^{+\infty} \mu\, d\mathcal{F}_\mu \quad \text{and} \quad A = B^2 = \int_0^{+\infty} \mu^2\, d\mathcal{F}_\mu,$$

or

$$A = B^2 = \int_0^{+\infty} \lambda\, d\mathcal{F}_{\sqrt{\lambda}}.$$

The family of operators $\mathcal{F}_{\sqrt{\lambda}}$, depending on the parameter $\lambda$, is
a resolution of the identity, and in view of the uniqueness of the
spectral function, $\mathcal{F}_{\sqrt{\lambda}} = \mathcal{E}_\lambda$; it follows that $B$ must coincide with $A^{1/2}$.
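In finite dimensions the integral for $A^{1/2}$ becomes a sum over eigenvalues, which gives a quick numerical check. The following is a minimal sketch for a real symmetric $2 \times 2$ matrix with two distinct positive eigenvalues; the helper names (`eig_sym2`, `sqrt_sym2`, `matmul2`) and the sample matrix are assumptions made for illustration.

```python
import math

def eig_sym2(m):
    """Eigenvalues (low, high) of a real symmetric 2x2 matrix [[a, b], [b, d]]."""
    (a, b), (_, d) = m
    mean, rad = (a + d) / 2.0, math.hypot((a - d) / 2.0, b)
    return mean - rad, mean + rad

def sqrt_sym2(m):
    """A^(1/2) = sqrt(lo) P_lo + sqrt(hi) P_hi, assuming 0 <= lo < hi."""
    lo, hi = eig_sym2(m)
    (a, b), (_, d) = m
    # Spectral projectors: P_hi = (A - lo E) / (hi - lo), P_lo = E - P_hi.
    p_hi = [[(a - lo) / (hi - lo), b / (hi - lo)],
            [b / (hi - lo), (d - lo) / (hi - lo)]]
    p_lo = [[1.0 - p_hi[0][0], -p_hi[0][1]], [-p_hi[1][0], 1.0 - p_hi[1][1]]]
    return [[math.sqrt(hi) * p_hi[i][j] + math.sqrt(lo) * p_lo[i][j]
             for j in range(2)] for i in range(2)]

def matmul2(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[2.0, 1.0], [1.0, 2.0]]          # eigenvalues 1 and 3
B = sqrt_sym2(A)
B_squared = matmul2(B, B)
```

`B_squared` reproduces `A` up to rounding, and the eigenvalues of `B` are the positive square roots of those of `A`, in line with the uniqueness argument above.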


198. Small perturbations of the spectrum. Let us consider the variation
in the spectrum of a self-conjugate operator when another self-conjugate
operator is added to it. Notice first of all that Theorem 1 of [157] holds for
unbounded self-conjugate operators. Two further theorems must be proved.

Let L be a subspace. Its dimensionality r is the number of elements of an 
orthonormal system complete in L. This number r may be finite or infinite. 
It is easily shown to be independent of the choice of complete orthonormal 
system in L. 

Lemma 1. Let $L_1$ and $L_2$ be two subspaces of $r_1$ and $r_2$ dimensions. If $r_1 < r_2$,
there is a non-zero element in $L_2$ which is orthogonal to all the elements of $L_1$.

Notice that, since $H$ is separable, the number $r_1$ is finite, whilst $r_2$ may be
either finite or infinite. Let us use reductio ad absurdum. Suppose that there is
no non-zero element in $L_2$ which is orthogonal to $L_1$. Let $x_1, x_2, \ldots, x_{r_1}$ be a
complete orthonormal system in $L_1$ (base in $L_1$) and $P$ a projector onto
subspace $L_2$. We have for any element $v \in L_2$:

$$(v,\, P x_k) = (P v,\, x_k) = (v,\, x_k).$$

The elements $P x_k$ of $L_2$ define some subspace $L'_2$ of dimensions $r'_2 \le r_1$ (the
sign is $<$ if the $P x_k$ are linearly dependent). Let us show that $L'_2$ must be the
same as $L_2$. If this were not the case, there would exist an element $y \in L_2$,
different from zero and orthogonal to $L'_2$, and we should have $(y, P x_k) = 0$,
i.e. by the above equation, $(y, x_k) = 0$ $(k = 1, 2, \ldots, r_1)$, i.e. $y$ is orthogonal
to $L_1$. But there is no such element $y$, by hypothesis. We have shown that $L'_2$
coincides with $L_2$, i.e. $r_2 = r'_2$. But $r'_2 \le r_1$, as we have seen, so that $r_2 \le r_1$,
which contradicts the condition in the lemma.

Lemma 2. Let $A$ be a self-conjugate operator (unbounded or bounded), $B$ a
bounded self-conjugate operator, $\mathcal{E}_\lambda$ the spectral function of $A$ and $\mathcal{E}'_\lambda$ the spectral
function of $A' = A + B$. Further, let $\Delta$ be some finite interval $[a, b]$, and $\Delta_{B,\varepsilon}$
the interval $[a - \|B\| - \varepsilon,\ b + \|B\| + \varepsilon]$, where $\varepsilon$ is any given positive
number. The number of dimensions of subspace $L_2 = (\Delta_{B,\varepsilon} \mathcal{E}'_\lambda)\, x$ $(x \in H)$ is now
not less than the number of dimensions of subspace $L_1 = (\Delta \mathcal{E}_\lambda)\, x$ $(x \in H)$.




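Notice first of all that we are using the notation of [141] (e.g. $\Delta \mathcal{E}_\lambda = \mathcal{E}_b - \mathcal{E}_a$).
We use reductio ad absurdum. Suppose that $L_1$ has more dimensions
than $L_2$. By Lemma 1, there exists an element $y \in L_1$, different from zero and
orthogonal to $L_2$. We can assume that $\|y\| = 1$. On writing $c = (a + b)/2$ for
simplicity, and noting that $y \in L_1$, we obtain

$$\| Ay - cy \|^2 = \int_a^b (\lambda - c)^2\, d(\mathcal{E}_\lambda y,\, y) \le \left( \frac{b - a}{2} \right)^2$$

and

$$\| A'y - cy \| \le \| Ay - cy \| + \| By \| \le \frac{b - a}{2} + \|B\|. \tag{89}$$

On the other hand, since $y \perp L_2$, we have

$$\| A'y - cy \|^2 = \int_{-\infty}^{+\infty} (\lambda - c)^2\, d(\mathcal{E}'_\lambda y,\, y) = \int_{-\infty}^{a - \|B\| - \varepsilon} (\lambda - c)^2\, d(\mathcal{E}'_\lambda y,\, y) + \int_{b + \|B\| + \varepsilon}^{+\infty} (\lambda - c)^2\, d(\mathcal{E}'_\lambda y,\, y),$$

whence we obtain, on again using the orthogonality of $y$ to $L_2$:

$$\| A'y - cy \|^2 \ge \left( \frac{b - a}{2} + \|B\| + \varepsilon \right)^2 \int_{-\infty}^{+\infty} d(\mathcal{E}'_\lambda y,\, y),$$

i.e., since the last integral is equal to $\|y\|^2 = 1$,

$$\| A'y - cy \| \ge \frac{b - a}{2} + \|B\| + \varepsilon. \tag{90}$$

The contradiction of inequalities (89) and (90) proves the lemma.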

THEOREM 1. Let the spectrum of the operator $A$ inside the interval $\Delta$ consist
of a finite number of eigenvalues, the sum of the multiplicities of which is equal to
$k$, where $k$ is a finite number, and let the distance of the remaining part of the
spectrum of $A$ from $\Delta$ be greater than $2\|B\|$.

The spectrum of $A'$ in the interval $\Delta_{B,0} = [a - \|B\|,\ b + \|B\|]$ now
consists of eigenvalues, the sum of the multiplicities of which is equal to $k$.

By Lemma 2, given any $\varepsilon > 0$, the number of dimensions of $L_2$ is $\ge k$. If
the $>$ sign holds, we can say, on again using Lemma 2 and the equation
$A = A' - B$, that the subspace $\Delta_{2B,2\varepsilon} \mathcal{E}_\lambda x$ $(x \in H)$ has more than $k$ dimensions.
But by hypothesis, this cannot be true for all $\varepsilon$ sufficiently close to zero, i.e. $L_2$
has $k$ dimensions for such $\varepsilon$, whence the theorem follows.

Note. Application of the theorem with k= 1 gives us the possibility 
of watching the variation of an isolated simple eigenvalue, in the case of small 
perturbations. 
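The case $k = 1$ can be watched numerically in two dimensions. The sketch below is an illustration only, under the assumption that $A$ and $B$ are real symmetric $2 \times 2$ matrices (so that $\|B\|$ is the largest absolute value of an eigenvalue of $B$); the matrices and the helper `eig_sym2` are hypothetical choices.

```python
import math

def eig_sym2(m):
    """Eigenvalues (low, high) of a real symmetric 2x2 matrix [[a, b], [b, d]]."""
    (a, b), (_, d) = m
    mean, rad = (a + d) / 2.0, math.hypot((a - d) / 2.0, b)
    return mean - rad, mean + rad

A = [[0.0, 0.0], [0.0, 10.0]]         # simple eigenvalue 0 inside Delta = [-1, 1]
B = [[0.3, 0.4], [0.4, -0.3]]         # symmetric perturbation
norm_B = max(abs(e) for e in eig_sym2(B))   # operator norm of a symmetric matrix

a, b = -1.0, 1.0                      # end-points of Delta
gap = 10.0 - b                        # distance of the rest of the spectrum from Delta

A_prime = [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]
eigs_prime = eig_sym2(A_prime)
in_widened = [e for e in eigs_prime if a - norm_B <= e <= b + norm_B]
```

Here `gap` exceeds `2 * norm_B`, and exactly one eigenvalue of `A_prime` falls in the widened interval $[a - \|B\|,\ b + \|B\|]$, as the theorem predicts for $k = 1$.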

THEOREM 2. If there is at least one point of condensation of the spectrum of
an operator $A$ inside an interval $\Delta$, there must be at least one point of condensation
of the spectrum of $A'$ inside the interval $\Delta_{B,0}$.

In this case $k = \infty$, and the theorem follows from Lemma 2.

Note. If $\lambda_0$ is a point of condensation of the spectrum of $A$, there is at
least one point of condensation of the spectrum of $A'$ in the interval
$[\lambda_0 - \|B\|,\ \lambda_0 + \|B\|]$.




199. The operator of multiplication. Let us take space $L_2$ on the
interval $(-\infty, +\infty)$ and the operator of multiplication by the
independent variable:

$$A f(x) = x f(x). \tag{91}$$

The lineal $D(A)$ consists of functions $f(x)$ of $L_2$ such that $x f(x) \in L_2$
also, and in particular, every function differing from zero only on
a finite interval belongs to $D(A)$, whence it follows that the lineal
$D(A)$ is everywhere dense in $H$. Let us show that $A$ is self-conjugate.
We have to show that, if $(Ax, y) = (x, y^*)$ for $x \in D(A)$, then $y \in D(A)$
and $y^* = Ay$. The condition implies in the present case that

$$\int_{-\infty}^{+\infty} x f(x)\, \overline{\varphi(x)}\, dx = \int_{-\infty}^{+\infty} f(x)\, \overline{\varphi^*(x)}\, dx \tag{92}$$

for all $f(x)$ of $D(A)$ and certain $\varphi(x)$ and $\varphi^*(x)$ of $L_2$, and we have
to show that $\varphi(x) \in D(A)$ and $\varphi^*(x) = x \varphi(x)$. We apply (92) to $f(x)$
of $L_2$, where $f(x)$ differs from zero only on some finite interval $(-a, +a)$.
Notice that such a function belongs to $D(A)$:

$$\int_{-a}^{+a} f(x) \left[ \overline{\varphi^*(x)} - x\, \overline{\varphi(x)} \right] dx = 0.$$

It follows immediately [52] that, since $f(x)$ is arbitrary,
$\varphi^*(x) - x \varphi(x)$ is equivalent to zero in the interval $(-a, +a)$, and hence
in $(-\infty, +\infty)$, since $a$ is arbitrary, i.e. we can take
$\varphi^*(x) - x \varphi(x) = 0$. But $\varphi^*(x) \in L_2$, so that $x \varphi(x) \in L_2$ also and $\varphi^*(x) = x \varphi(x)$,
which is what we set out to prove. The spectral function of operator
(91) is defined as in [152], and is given by

$$\mathcal{E}_\lambda f(x) = \begin{cases} f(x) & \text{for } x \le \lambda, \\ 0 & \text{for } x > \lambda, \end{cases} \tag{93}$$

and the operator has a purely continuous spectrum, distributed over
the interval $-\infty < \lambda < +\infty$.

Operator (91) is obviously unbounded. Notice also that every
function $f(x)$ of $D(A)$ is summable over $(-\infty, +\infty)$. For, on putting
$x f(x) = \omega(x)$, where $\omega(x) \in L_2$, we can write $f(x) = (1/x)\, \omega(x)$; since
$1/x$ and $\omega(x)$ belong to $L_2$ in any interval $(-\infty, -a)$ and $(a, \infty)$,
where $a > 0$, $f(x)$ must be summable in $(-\infty, +\infty)$.
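To see that $D(A)$ is genuinely narrower than $L_2$, one can test a single slowly decaying function. The choice of $f$ below is an illustrative assumption, not an example taken from the text.

```latex
f(x) = \frac{1}{1 + |x|}: \qquad
\int_{-\infty}^{+\infty} |f(x)|^2\, dx
  = 2 \int_0^{+\infty} \frac{dx}{(1 + x)^2} = 2 < +\infty,
\qquad\text{whereas}\qquad
|x f(x)|^2 = \frac{x^2}{(1 + |x|)^2} \to 1 \quad (|x| \to \infty).
```

Thus $f \in L_2$ but $x f(x) \notin L_2$, so that $f$ does not belong to $D(A)$. This agrees with the remark on summability: the integral of $1/(1 + |x|)$ over $(-\infty, +\infty)$ diverges, so $f$ is not summable and could not belong to $D(A)$.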

Precisely as in [152], we can consider an operator of multiplication
by a function, which we shall assume to be real and bounded:

$$A' f(x) = \omega(x) f(x), \tag{94}$$

an unbounded operator being obtained in the case of unbounded
$\omega(x)$. Suppose for definiteness that $\omega(x)$ is bounded throughout
$(-\infty, +\infty)$, except for the neighbourhoods, however small, of a finite
number of points. On taking any closed interval that does not contain
these points, and arguing as above, (94) is seen to be a self-conjugate
operator for $f(x) \in L_2$ such that $\omega(x) f(x) \in L_2$ also. Its spectral
function is given, as in [152], by

$$\mathcal{E}'_\lambda f(x) = \begin{cases} f(x) & \text{for } \omega(x) \le \lambda, \\ 0 & \text{for } \omega(x) > \lambda. \end{cases} \tag{95}$$

If say $\omega(x) \in L_2$, the lineal $D(A')$ contains all bounded functions
of $L_2$ and $A'$ is also a self-conjugate operator. Notice that the operator
$A'$ can be regarded as a function $\omega(A)$ of the operator $A$ of multiplication
by the independent variable for the class of functions $\omega(x)$
indicated in [197].

Let $B$ denote the self-conjugate operator

$$B \Phi(x) = i\, \frac{d\Phi(x)}{dx} = i \varphi(x), \tag{96}$$

defined in $H = L_2(-\infty, +\infty)$ on the set $D(B)$ of functions $\Phi(x)$,
absolutely continuous in any finite interval and having a derivative
in $L_2(-\infty, +\infty)$ [188]. On using Fourier's transformation, the lineals
$D(A)$ and $D(B)$ may be formed.


On writing

$$\Psi_N(x) = \frac{1}{\sqrt{2\pi}} \int_{-N}^{+N} \Phi(t)\, e^{ixt}\, dt \quad \text{and} \quad \psi_N(x) = \frac{1}{\sqrt{2\pi}} \int_{-N}^{+N} i \varphi(t)\, e^{ixt}\, dt, \tag{97}$$

where $\Phi(t)$ is any function of $D(B)$, and integrating by parts, we get

$$x \Psi_N(x) - \psi_N(x) = \frac{1}{\sqrt{2\pi}\, i} \left[ \Phi(N)\, e^{ixN} - \Phi(-N)\, e^{-ixN} \right], \tag{98}$$

where the right-hand side tends uniformly with respect to $x$ in the
infinite interval to zero as $N \to \infty$ [188]. The functions $\Psi_N(x)$ and
$\psi_N(x)$ tend in the mean in the infinite interval to functions $\Psi(x)$
and $\psi(x)$ of $L_2$ [178]. This will be all the more true in any finite interval
$[-a, +a]$, and furthermore, $x \Psi_N(x)$ will tend in the mean to $x \Psi(x)$
in such an interval. It follows from (98) that, given any positive $\varepsilon$
and sufficiently large $N$, we have

$$\int_{-a}^{+a} | x \Psi_N(x) - \psi_N(x) |^2\, dx \le \varepsilon,$$

or, since we can pass to the limit under the norm sign,

$$\int_{-a}^{+a} | x \Psi(x) - \psi(x) |^2\, dx \le \varepsilon,$$

whence, since $\varepsilon$ and $a$ are arbitrary,

$$\int_{-\infty}^{+\infty} | x \Psi(x) - \psi(x) |^2\, dx = 0,$$

i.e. $x \Psi(x) = \psi(x)$, or

$$x\, \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} \Phi(t)\, e^{ixt}\, dt = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} i \varphi(t)\, e^{ixt}\, dt, \tag{99}$$

which can be written as

$$T^*(\Phi) = \frac{1}{x}\, T^*(i\varphi) = f, \tag{100}$$

and, since $T^*(i\varphi) \in L_2$, we see that $x f(x) \in L_2$ as well as $f(x) \in L_2$,
i.e. $f(x) \in D(A)$.

Let us now show that, if $f(x)$ is any function of $D(A)$, $T(f) \in D(B)$.
Since $f(x)$ and $x f(x)$ belong to $L_2$, we can form $T(f)$ and $T(xf)$. On taking
(100) into account, we can introduce the notation

$$\Phi(x) = T(f) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} f(t)\, e^{-ixt}\, dt, \qquad i \varphi(x) = T(xf) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} t f(t)\, e^{-ixt}\, dt. \tag{101}$$

We have [178]

$$i \int_0^N \varphi(t)\, dt = \frac{1}{\sqrt{2\pi}\, i} \int_{-\infty}^{+\infty} f(t) \left( 1 - e^{-iNt} \right) dt.$$

If we make use of the first of formulae (101), from which it follows,
since $f(t)$ is summable in $(-\infty, +\infty)$, that

$$\Phi(0) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} f(t)\, dt,$$

we can rewrite the above formula as

$$i \int_0^N \varphi(t)\, dt = \frac{1}{i} \left[ \Phi(0) - \Phi(N) \right],$$

whence it follows that $\Phi(x) \in D(B)$ and $i \varphi(x) = B \Phi(x)$. We saw
above that $T^*$ transforms any element $\Phi$ of $D(B)$ into an element
$f$ of $D(A)$, and we have just shown that $T$ transforms any $f$ of $D(A)$
into $\Phi$ of $D(B)$. A one-to-one correspondence is thus established
between $D(A)$ and $D(B)$, where $T$ transforms $D(A)$ into $D(B)$ and
a passage from $f(x)$ to $x f(x)$ before the transformation corresponds
to a passage from $\Phi(x)$ to $B \Phi(x) = i \varphi(x)$ after transformation, i.e.

$$B = T A T^*, \tag{102}$$

which leads to the following theorem.

THEOREM. The self-conjugate operators A and B are unitary equi- 
valents and (102) holds, where T is the Fourier transformation. 
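Relation (102) can be checked by hand on one function. The Gaussian below is an illustrative choice (an assumption, not an example from the text); $T$ is taken with the kernel $e^{-ixt}/\sqrt{2\pi}$ as in (101), and $B = i\, d/dx$ as in (96).

```latex
f(t) = e^{-t^2/2}: \qquad
T(f)(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} e^{-t^2/2}\, e^{-ixt}\, dt = e^{-x^2/2},
\\[6pt]
T(Af)(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} t\, e^{-t^2/2}\, e^{-ixt}\, dt
  = i \frac{d}{dx} \left[ \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} e^{-t^2/2}\, e^{-ixt}\, dt \right]
  = i \frac{d}{dx}\, e^{-x^2/2} = -i x\, e^{-x^2/2} = B\, T(f)(x).
```

Both sides agree, i.e. $TAf = BTf$ for this $f$, which is relation (102) applied to $T(f)$. Differentiation under the integral sign is legitimate here because of the rapid decay of the Gaussian.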

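We have $\mathcal{E}'_\lambda = T \mathcal{E}_\lambda T^*$ for the spectral function $\mathcal{E}'_\lambda$ of the operator
$B$, where $\mathcal{E}_\lambda$ is given by (93), i.e.

$$\mathcal{E}'_\lambda f(x) = \frac{1}{2\pi} \int_{-\infty}^{\lambda} \left[ \int_{-\infty}^{+\infty} f(t)\, e^{iyt}\, dt \right] e^{-ixy}\, dy, \tag{103}$$

or

$$\mathcal{E}'_\lambda f(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\lambda} T^*(f)\, e^{-ixy}\, dy. \tag{104}$$

We can write $\Phi(x)$ as

$$\Phi(x) = \int_{-\infty}^{x} \varphi(t)\, dt = - \int_{x}^{+\infty} \varphi(t)\, dt, \tag{105}$$

where it cannot be asserted that $\varphi(t)$ is absolutely integrable in the
infinite interval, and the integrals written have to be understood as
the limits of integrals over a finite interval as this is extended. The
self-conjugate operator $B^{-1}$, which is obviously defined by
$B^{-1} \varphi = -i \Phi$, is not given in the whole of $L_2$; it is given only for $\varphi(x)$ such
that $\Phi(x)$, defined by (105), also belongs to $L_2$. This is due to the
fact that $\lambda = 0$ belongs to the continuous spectrum of $B$.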

If we consider operator (94), the operator $T A' T^*$ will be a function
$\omega(B)$ of the operator $B$, so that

$$\omega(B) f(x) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \omega(y) \left[ \int_{-\infty}^{+\infty} f(t)\, e^{iyt}\, dt \right] e^{-ixy}\, dy. \tag{106}$$

If $\omega(x)$ is an unbounded function, the lineal on which this operator
is defined is obtained from $D(A')$ with the aid of the operator $T$.
Remember that the integrals written, with infinite limits, have to
be understood in the sense of a convergence in the mean. Like $A$, $B$
has a simple continuous spectrum. The function $\omega(x)$ appearing in
(106) must evidently satisfy the conditions that we formulated for
$f(\lambda)$ in [197].


200. Integral operators. We take the integral operator in the
interval $[a, b]$ with kernel $K(x, y)$, satisfying

$$K(y, x) = \overline{K(x, y)} \tag{107}$$

and such that

$$K^2(x) = \int_a^b | K(x, y) |^2\, dy < +\infty \qquad (K(x) \ge 0) \tag{108}$$

for almost all $x$ of $[a, b]$. The corresponding operator

$$\varphi(x) = K f(x) = \int_a^b K(x, y)\, f(y)\, dy \tag{109}$$

is defined on the lineal $D(K)$ of $f(x)$ of $L_2$ in $[a, b]$ such that $\varphi(x)$,
defined by (109), also belongs to $L_2$. Let us also consider, as in [173],
the lineal $l$ of functions $f(x)$ of $L_2$ such that

$$\int_a^b K(x)\, | f(x) |\, dx < +\infty. \tag{110}$$

We have seen that the lineal $l$ is everywhere dense in $L_2$ [173].
Let us show that, if $f(x) \in l$, it belongs all the more to $D(K)$.

It will follow from this that $D(K)$ is everywhere dense in $L_2$. Let
$f(x) \in l$. We can write

$$| \varphi(x) |^2 = \int_a^b K(x, y)\, f(y)\, dy \cdot \int_a^b \overline{K(x, t)}\, \overline{f(t)}\, dt,$$

so that

$$\int_a^b | \varphi(x) |^2\, dx \le \int_a^b \int_a^b \int_a^b | K(x, y) |\, | K(x, t) |\, | f(y) |\, | f(t) |\, dy\, dt\, dx, \tag{111}$$

and it is sufficient to verify that the integral on the right has a finite
value no matter what the order of integration. By Buniakowski's
inequality:

$$\int_a^b | K(x, y) |\, | K(x, t) |\, dx \le \left[ \int_a^b | K(x, y) |^2\, dx \cdot \int_a^b | K(x, t) |^2\, dx \right]^{1/2} = K(y)\, K(t),$$

and the right-hand side of (111) does not exceed the product

$$\int_a^b K(y)\, | f(y) |\, dy \cdot \int_a^b K(t)\, | f(t) |\, dt,$$

which has a finite value, since $f(x) \in l$. Let $A f(x)$ denote the operator
defined by (109) on the lineal $l$. It may easily be shown that this is
a symmetric operator, i.e.

$$\int_a^b \left[ \int_a^b K(x, y)\, f(y)\, dy \right] \overline{\omega(x)}\, dx = \int_a^b \left[ \int_a^b K(x, y)\, \overline{\omega(x)}\, dx \right] f(y)\, dy, \tag{112}$$

if $f(x)$ and $\omega(x)$ belong to $l$, where, by (107),

$$\int_a^b K(x, y)\, \overline{\omega(x)}\, dx = \overline{\int_a^b K(y, x)\, \omega(x)\, dx}$$

obviously belongs to $L_2$ as a function of $y$. To prove (112), it is
sufficient to verify that

$$\int_a^b \int_a^b | K(x, y) |\, | f(y) |\, | \omega(x) |\, dy\, dx \tag{113}$$

is finite whatever the order of integration. If we integrate first with
respect to $x$ and use Buniakowski's inequality, expression (113) is
seen to be not greater than

$$\| \omega \| \int_a^b K(y)\, | f(y) |\, dy,$$

and this quantity is finite, since $f(y) \in l$.
The symmetric operator $A$, defined by (109) on the lineal $l$, is by
no means always self-conjugate; but it has a conjugate $A^*$. Let us
show that $A^*$ is the same as the operator $K$, which is defined by the
same formula (109) on the lineal $D(K)$ of $f(x) \in L_2$ such that $\varphi(x) \in L_2$
also. Suppose that, for all $f(x) \in l$,

$$\int_a^b \left[ \int_a^b K(x, y)\, f(y)\, dy \right] \overline{\omega(x)}\, dx = \int_a^b f(x)\, \overline{\omega^*(x)}\, dx, \tag{114}$$

where $\omega(x)$ and $\omega^*(x) \in L_2$. We have to show that

$$\omega^*(x) = \int_a^b K(x, y)\, \omega(y)\, dy, \tag{115}$$

whence it will follow that $\omega(x) \in D(K)$ also. When proving that
integral (113) is finite, we only made use of the fact that functions
$f(x)$ belong to $l$. We can therefore change the order in the integral on
the left-hand side of (114), and this formula can be rewritten as

$$\int_a^b f(x) \left[ \overline{\omega^*(x)} - \overline{\int_a^b K(x, y)\, \omega(y)\, dy} \right] dx = 0. \tag{116}$$

The proof of the theorem of [173] leads us at once to the fact that
the difference in square brackets must vanish, and we get (115).
Conversely, if $\omega(x) \in D(K)$ and $\omega^*(x)$ is given by (115), it follows
at once from the above working that (114) holds. The above discussion
yields the theorem:

THEOREM. Let $K(y, x) = \overline{K(x, y)}$ and condition (108) be fulfilled.
Let $l$ be a lineal of functions $f(x) \in L_2$ in the interval $[a, b]$, satisfying
condition (110), and $D(K)$ the lineal of functions $f(x) \in L_2$ such that
$\varphi(x)$, defined by (109), belongs to $L_2$. The lineal $l$ is now everywhere dense
in $L_2$ and belongs to the lineal $D(K)$. If, in addition, $A$ is the operator
defined by (109) on the lineal $l$, and $K$ is the operator defined by the same
formula on $D(K)$, $A$ is symmetric and $A^* = K$.

A necessary condition for $K$ to be self-conjugate is that it be
symmetric, which reduces, by (107), to the equation

$$\int_a^b \left[ \int_a^b K(y, x)\, f(x)\, dx \right] \overline{\omega(y)}\, dy = \int_a^b \left[ \int_a^b K(y, x)\, \overline{\omega(y)}\, dy \right] f(x)\, dx, \tag{117}$$

which must be satisfied for all $f(x)$ and $\omega(x)$ of $D(K)$. Let us show that
this condition is also sufficient for $K$ to be self-conjugate. In fact,
let

$$\int_a^b \left[ \int_a^b K(x, y)\, f(y)\, dy \right] \overline{\omega(x)}\, dx = \int_a^b f(x)\, \overline{\omega^*(x)}\, dx$$

for all $f(x)$ of $D(K)$, where $\omega(x)$ and $\omega^*(x)$ belong to $L_2$. We have to
show that $\omega(x) \in D(K)$ and (115) holds, where, in view of the definition
of $D(K)$, it is sufficient to prove (115). On subtracting (117) term by
term from the last equation and noting (107), we get (116), whence
(115) follows, in view of the arbitrary choice of $f(x)$ of $D(K)$, which
contains $l$. Thus the necessary and sufficient condition for an
operator $K$ to be self-conjugate is that (117) be satisfied for any $f(x)$ and
$\omega(x)$ of $D(K)$.


Let us mention some simple examples of self-conjugate operators in the case
of a kernel dependent on a difference in the infinite interval $(-\infty, +\infty)$.
Let $g(t)$ be a real even function of $L_2$ in this interval and $f(x)$ any function of
$L_2$. Let us write $G(t) = T^*(g)$ and $F(t) = T^*(f)$. The function $G(t)$ is real, since
$g(t)$ is a real even function. We can write [178]:

$$\varphi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} g(x - y)\, f(y)\, dy = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} G(t)\, F(t)\, e^{-ixt}\, dt, \tag{118}$$

where, since $G(t) \in L_2$ and $F(t) \in L_2$, the product $G(t) F(t)$ is summable in
$(-\infty, +\infty)$. Let $l'$ be the lineal of functions $F(t)$ of $L_2$ such that $G(t) F(t) \in L_2$.
The right-hand side of (118) is $T(GF)$ on the lineal $l'$, so that $\varphi(x)$ must belong
to $L_2$. On turning our attention to the middle part of (118), we can thus say
that, on the lineal $l$, which is got from $l'$ with the aid of transformation $T$,
the integral operator $K$ with kernel $g(x - y)$ is the unitary equivalent to the
operator of multiplication by the function $G(t)$ of $L_2$, and is therefore a
self-conjugate operator. Remember that $D(K)$ denotes the lineal of $f(x)$ of $L_2$ such
that $\varphi(x)$, defined by (118), also belongs to $L_2$. It follows from the above
arguments only that $l$ belongs to $D(K)$. It can be shown that $l$ coincides with
$D(K)$. This assertion is obviously equivalent to the following: if $\varphi(x) \in L_2$ in
(118), then $G(t) F(t) \in L_2$. The coincidence with $D(K)$ follows immediately
from the fact that it is impossible to extend a self-conjugate operator so as
to again obtain a self-conjugate operator. We proved this in [187]. Hence,
if $g(t)$ is a real even function of $L_2$, the integral operator with kernel $g(x - y)$
is a self-conjugate operator in $D(K)$.
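A concrete kernel makes the unitary equivalence tangible. The function $g$ below is an illustrative assumption (it is real, even and belongs to $L_2$), and $T^*$ is taken with the kernel $e^{ixt}/\sqrt{2\pi}$ used above.

```latex
g(t) = e^{-|t|}: \qquad
G(t) = T^*(g) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} e^{-|s|}\, e^{its}\, ds
     = \frac{1}{\sqrt{2\pi}} \cdot \frac{2}{1 + t^2}
     = \sqrt{\frac{2}{\pi}}\; \frac{1}{1 + t^2},
\qquad 0 < G(t) \le \sqrt{\frac{2}{\pi}}.
```

Since $G(t)$ is real and bounded, multiplication by $G(t)$ is a bounded self-conjugate operator defined throughout $L_2$. For this kernel, therefore, $l'$ and $D(K)$ are the whole of $L_2$, and the operator (118) is bounded, with norm equal to the upper bound $\sqrt{2/\pi}$ of $G(t)$.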


201. The extension of a closed symmetric operator. We shall
assume in future that $A$ is a closed symmetric operator. The next
two theorems are fundamental for what follows.

THEOREM 1. The lineal $D(A)$ of elements $x$ is mapped one-to-one
in accordance with the formulae

$$y = (A + iE)\, x, \tag{119}$$

$$z = (A - iE)\, x, \tag{120}$$

onto subspaces $L_i(A)$ and $L_{-i}(A)$, where, if $y$ and $z$ are elements of these
subspaces, corresponding to the same $x$ of $D(A)$, the distributive operator
$U$, transforming $y$ into $z$:

$$z = Uy, \tag{121}$$

maps $L_i(A)$ one-to-one onto $L_{-i}(A)$ whilst preserving the norm and the
scalar product:

$$\| Uy \| = \| y \|; \qquad (Uy_1,\, Uy_2) = (y_1,\, y_2). \tag{122}$$
We have 


|\(A + iE) a||? = ((A + iE) z, (A + iE) £) = 
= (Az, Ax) + ì(x, Ax) — (Ax, x) + (2,2), 


201] THE EXTENSION OF A CLOSED SYMMETRIC OPERATOR 587 


whence it follows, since A is symmetric, that 
||(4 + iE) a|P = [Az]? + |] el? (123) 
(€D (4A)). 

Let us show that, given distinct $x$, (119) yields distinct $y$. If distinct
$x_1$ and $x_2 \in D(A)$ were to give the same $y$, their difference $x = x_1 - x_2$
would give $y = 0$, and we have to show that $(A + iE)x = 0$ implies
$x = 0$. But this is an immediate consequence of (123). Thus (119)
maps $D(A)$ one-to-one onto a lineal $L_i(A)$. Let us show that it is
closed, i.e. that it is a subspace. On using (123), we get
$\|(A + iE)x\| \ge \|x\|$, whence it follows that $(A + iE)^{-1}$ is bounded.
But we proved in [184] that, if an operator $B$ is closed, and the bounded
operator $B^{-1}$ exists on $R(B)$, $R(B)$ must be a subspace. In other words,
$L_i(A)$ must be a subspace. Similarly, it follows from the equation

$$\|(A - iE)x\|^2 = \|Ax\|^2 + \|x\|^2, \tag{123$_1$}$$

analogous to (123), that (120) maps $D(A)$ one-to-one onto a subspace
$L_{-i}(A)$. On taking $y$ and $z$, corresponding to the same $x$,
we get the fully defined distributive transformation (121), mapping
$L_i(A)$ one-to-one onto $L_{-i}(A)$, where (123) and (123$_1$) may be
written as $\|y\|^2 = \|Ax\|^2 + \|x\|^2$ and $\|z\|^2 = \|Ax\|^2 + \|x\|^2$,
i.e. $\|z\| = \|y\|$ or $\|Uy\| = \|y\|$. The proof of the second of formulae
(122) is thus precisely the same as for unitary operators [137],
and the theorem is therefore proved.
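
The identity (123) and the norm-preserving property of $U$ can be checked numerically in a finite-dimensional setting. The sketch below is only an illustration under an assumption that is not part of the text: a Hermitian 2x2 matrix stands in for the symmetric operator $A$, and the vector `x` is an arbitrary choice.

```python
# Finite-dimensional sketch: a Hermitian 2x2 matrix plays the role of A.
# The matrix and the test vector are illustrative assumptions.

def matvec(M, x):
    """Apply the square matrix M to the vector x."""
    return [sum(M[r][c] * x[c] for c in range(len(x))) for r in range(len(M))]

def norm_sq(x):
    """Squared norm of a complex vector."""
    return sum(abs(t) ** 2 for t in x)

A = [[2.0, 1 - 1j],
     [1 + 1j, -1.0]]              # Hermitian: A[r][c] == conj(A[c][r])
x = [0.5 + 2j, -1.0 + 0.25j]      # arbitrary element of the space

Ax = matvec(A, x)
y = [Ax[k] + 1j * x[k] for k in range(2)]   # y = (A + iE) x, formula (119)
z = [Ax[k] - 1j * x[k] for k in range(2)]   # z = (A - iE) x, formula (120)

lhs = norm_sq(y)                            # ||(A + iE) x||^2
rhs = norm_sq(Ax) + norm_sq(x)              # ||Ax||^2 + ||x||^2, as in (123)
print(abs(lhs - rhs) < 1e-12, abs(norm_sq(z) - norm_sq(y)) < 1e-12)
```

Both comparisons reduce to the fact that $(Ax, x)$ is real, which is where the symmetry of $A$ enters.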

Notice that, if $A$ is self-conjugate, then, since $\pm i$ are regular
points of $A$, $L_i(A)$ and $L_{-i}(A)$ coincide with $H$ [189]
and $U$ is a unitary transformation. If at least one of the subspaces
does not coincide with $H$, $U$ is called an isometric operator.
In general, a distributive operator $U$, defined on a subspace
$L'$ and mapping it one-to-one onto another subspace $L''$ whilst
preserving the norm (and hence the scalar product), is known as an
isometric operator. The inverse $U^{-1}$, mapping $L''$ onto $L'$, is
obviously also an isometric operator. If $L'$ and $L''$ coincide with $H$, $U$ is
a unitary operator defined throughout $H$. The formulae $y = Ax + ix$
and $Uy = Ax - ix$ map $D(A)$ one-to-one onto $L_i(A)$ and $L_{-i}(A)$,
and lead to the formulae

$$x = \frac{1}{2i}(y - Uy); \quad Ax = \frac{1}{2}(y + Uy),$$

the first of which maps $L_i(A)$ one-to-one onto $D(A)$. If we replace
$y$ by $2iy$, which leads to the same lineal $L_i(A)$ of elements $y$, we obtain
the simpler formulae

$$x = y - Uy; \tag{124}$$
$$Ax = i(y + Uy), \tag{125}$$

the first of which maps $L_i(A)$ onto $D(A)$ one-to-one as before,
whilst the second gives the corresponding element $Ax$. Notice that the
lineal $D(A)$, defined by (124), is dense in $H$. The isometric operator
$U$ is known as a Helly transformation of the closed symmetric operator
$A$ (in most of the literature it is called the Cayley transform of $A$).
Let us now prove the converse, in the accepted sense, of the
previous theorem.

THEOREM 2. If $U$ is an isometric operator, mapping the subspace $L'$
onto the subspace $L''$, and formula (124) for $y$ belonging to $L'$ defines
a lineal $l$ dense in $H$, then (125) defines a closed symmetric operator
$A$ on $l$, where $U$ is the Helly transformation of $A$, whilst $L'$ and $L''$
coincide with $L_i(A)$ and $L_{-i}(A)$.

We must show first of all that, given distinct $y$ of $L'$, (124) yields
distinct $x$, i.e. as above, we must show that $y_0 - Uy_0 = 0$ implies
$y_0 = 0$. We form the scalar product $(y_0, x)$. If we can show that it
vanishes for any $x$ of $l$, we can assert that $y_0 = 0$, since this lineal is
dense in $H$. Thus

$$(y_0, x) = (y_0, y - Uy) = (y_0, y) - (y_0, Uy),$$

or, since $U$ is isometric:

$$(y_0, x) = (Uy_0, Uy) - (y_0, Uy) = (Uy_0 - y_0, Uy) = (0, Uy) = 0,$$


which we wanted to prove. Given any $x \in l$, we obtain in accordance
with (124) a definite $y \in L'$, and in accordance with (125), a definite
$Ax$. A distributive operator $A$ is thus obtained. Let $x'$ and $x''$ be two
elements of $l$, and $y'$, $y''$ the corresponding elements of $L'$. Using
the fact that $U$ is isometric, we obtain:

$$(Ax', x'') = (i(y' + Uy'), y'' - Uy'') = i(y', y'') + i(Uy', y'') - i(y', Uy'') - i(Uy', Uy'') = i(Uy', y'') - i(y', Uy'').$$

We get the same result on expanding $(x', Ax'') = (y' - Uy', i(y'' + Uy''))$,
i.e. $(Ax', x'') = (x', Ax'')$, so that $A$ is symmetric. Let
us show that $A$ is closed. Let $x_n \in l$ be such that

$$x_n \Rightarrow x \quad \text{and} \quad Ax_n \Rightarrow w. \tag{126}$$

We have to show that $x \in l$ and $w = Ax$. Let $y_n$ denote the elements
of $L'$ corresponding to $x_n$, i.e.

$$x_n = y_n - Uy_n; \quad Ax_n = i(y_n + Uy_n); \tag{127}$$

it follows from these equations that $y_n = (1/2i)(Ax_n + ix_n) \Rightarrow (1/2i)(w + ix)$.
On writing this limit as $y$ for brevity, we can
say that $y \in L'$, since $L'$ is a subspace, and that $Uy_n \Rightarrow Uy$, since
$U$ is an isometric operator, and hence $\|U(y - y_n)\| = \|y - y_n\|$.
On passing to the limit in (127), we get $x = y - Uy$ and $w = i(y + Uy)$,
where $y \in L'$, i.e. $x \in l$ and $w = Ax$, which we wanted to
prove. Finally, replacing $x$ by $2ix$ in (124) and (125) gives

$$y = (A + iE)x \quad \text{and} \quad Uy = (A - iE)x \quad (x \in l),$$

whence it follows at once that $U$ is a Helly transformation of $A$, and
that $L'$ and $L''$ are $L_i(A)$ and $L_{-i}(A)$; the theorem is proved.
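
The passage from $U$ back to $A$ by means of (124) and (125) can be illustrated in finite dimensions. The following is a minimal sketch under assumptions not made in the text: $A$ is replaced by a Hermitian 2x2 matrix, so that $U = (A - iE)(A + iE)^{-1}$ is unitary, and (124), (125) combine into $A = i(E + U)(E - U)^{-1}$. The helper names are hypothetical.

```python
# Sketch only: a Hermitian 2x2 matrix stands in for A; helper names are illustrative.

def matmul(P, Q):
    n = len(P)
    return [[sum(P[r][k] * Q[k][c] for k in range(n)) for c in range(n)] for r in range(n)]

def inv2(M):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]

def adjoint(M):
    n = len(M)
    return [[M[c][r].conjugate() for c in range(n)] for r in range(n)]

def max_diff(P, Q):
    return max(abs(P[r][c] - Q[r][c]) for r in range(2) for c in range(2))

A = [[2.0, 1 - 1j], [1 + 1j, -1.0]]
E = [[1.0, 0.0], [0.0, 1.0]]
A_plus = [[A[r][c] + 1j * E[r][c] for c in range(2)] for r in range(2)]    # A + iE
A_minus = [[A[r][c] - 1j * E[r][c] for c in range(2)] for r in range(2)]   # A - iE

U = matmul(A_minus, inv2(A_plus))            # U y = (A - iE) x when y = (A + iE) x
unitary_defect = max_diff(matmul(adjoint(U), U), E)

E_minus_U = [[E[r][c] - U[r][c] for c in range(2)] for r in range(2)]      # x = y - Uy, (124)
E_plus_U = [[E[r][c] + U[r][c] for c in range(2)] for r in range(2)]       # Ax = i(y + Uy), (125)
A_rec = [[1j * t for t in row] for row in matmul(E_plus_U, inv2(E_minus_U))]
recovery_defect = max_diff(A_rec, A)
print(unitary_defect < 1e-12, recovery_defect < 1e-12)
```

In this finite-dimensional case $E - U$ is always invertible; in the theorem the corresponding requirement is that the lineal of elements $y - Uy$ be dense in $H$.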

The above theorems throw light on the possible extensions of a
closed symmetric operator $A$. Let $B$ be a closed symmetric extension
of $A$ (not coinciding with $A$). The right-hand sides of the equations

$$y = (B + iE)x; \quad z = (B - iE)x$$

are defined on the lineal $D(B)$, wider than $D(A)$, whilst they give the
same result for elements $x$ belonging to $D(A)$ as the right-hand sides
of (119) and (120). The subspaces $L_i(B)$ and $L_{-i}(B)$ are therefore
strictly wider than the subspaces $L_i(A)$ and $L_{-i}(A)$, and if we write $V$
for the Helly transformation of the operator $B$, we can say that
$V$ transforms $L_i(B)$ into $L_{-i}(B)$, and coincides with $U$ on $L_i(A)$, i.e.
the isometric operator $V$ is an extension of the isometric operator $U$.
On using Theorem 2, we can say that, conversely, any extension of
an isometric operator $U$, leading to another isometric operator $V$,
yields, in accordance with

$$x = y - Vy; \quad Bx = i(y + Vy) \quad (y \in D(V)), \tag{128}$$

a closed symmetric operator $B$ which is an extension of $A$. By what
has been proved, it is only in this way that extensions of $A$ can be
obtained that are closed symmetric operators.

If $A$ is a self-conjugate operator, $L_i(A)$ and $L_{-i}(A)$ are the whole
of $H$, and extension is impossible.




202. Deficiency indices. Let $M_i(A)$ and $M_{-i}(A)$ denote the subspaces
complementary to $L_i(A)$ and $L_{-i}(A)$, and let $p$ and $q$ be the numbers
of dimensions of the former subspaces. If, say, $L_i(A)$ is the whole
of $H$, $M_i(A)$ is absent, and we take $p = 0$.

If $M_i(A)$ is finite-dimensional, $p$ is the number of its dimensions,
whilst $p = \infty$ if $M_i(A)$ is infinite-dimensional. The number pair
$(p, q)$ defines the so-called deficiency indices of the operator $A$. Let
us prove a number of simple theorems on deficiency indices.

THEOREM 1. The necessary and sufficient condition for a symmetric 
closed operator A to be self-conjugate is that both its deficiency indices 
be zero. 

If $A$ is self-conjugate, (119) and (120) transform $D(A)$ into $H$, as
we know from [201], so that $p = q = 0$. Suppose conversely that
$p = q = 0$. Now, $L_i(A)$ and $L_{-i}(A)$ coincide with $H$, and $U$ is a unitary
operator (defined in the whole of $H$, like $U^{-1}$). Let $(Ax, v) = (x, v^*)$
for all $x$ of $D(A)$. We have to show that $v \in D(A)$ and $v^* = Av$. By
(124) and (125), the previous equation can be rewritten as

$$i(y, v) + i(Uy, v) = (y, v^*) - (Uy, v^*),$$

whence, since $U$ is unitary,

$$(y, v^* - U^{-1}v^* + iv + iU^{-1}v) = 0,$$

and we have, since $y$ is arbitrary: $v^* + iv = U^{-1}(v^* - iv)$. On writing
$v^* + iv = 2iy'$, we have $v^* - iv = 2iUy'$, whence $v = y' - Uy'$ and
$v^* = i(y' + Uy')$, i.e. $v \in D(A)$ and $v^* = Av$; the theorem is proved.
The sufficiency is also a consequence of Theorem 2 of [187].

Before proving the next theorems, the structure of isometric operators
must be explained. It can be shown, precisely as for unitary
operators [137], that an isometric transformation $U$ of the subspace
$L'$ into the subspace $L''$ amounts to a transformation of a complete
orthonormal system $x_1, x_2, \ldots$ in $L'$ into a complete orthonormal
system $y_1, y_2, \ldots$ in $L''$, so that $Ux_k = y_k$ ($H$ is assumed separable) and

$$U \sum_k a_k x_k = \sum_k a_k y_k.$$

Here, obviously, either both subspaces are infinite-dimensional or
both have the same number of dimensions. If this condition is fulfilled,
in view of the arbitrary choice of base vectors, we can form an infinite
set of isometric mappings of $L'$ onto $L''$. If we have an isometric
operator $U$, mapping $L_i(A)$ onto $L_{-i}(A)$, we can only widen it by
the addition of the same number of new base vectors from $M_i(A)$
and $M_{-i}(A)$, and by establishing a correspondence between them.
The following general theorem is an immediate consequence of these
remarks.

THEOREM 2. The necessary and sufficient condition for a closed 
symmetric operator A to be extensible whilst preserving its symmetry 
is that both the deficiency indices of A be non-zero. If this condition is 
fulfilled, an infinite set of extensions exists. The necessary and sufficient 
condition for A to be extensible as far as a self-conjugate operator is that 
the deficiency indices of A be the same (non-zero), and if this is the 
case, an infinite set of such extensions exists. 

A general scheme can be described for extending an operator
$A$. Having extracted from $M_i(A)$ and $M_{-i}(A)$ any given subspaces
$N_i$ and $N_{-i}$ with the same number of dimensions, we form some
isometric operator $V$, mapping $N_i$ onto $N_{-i}$. We define for the
extended operator $B$ the subspace $L_i(B)$ as the orthogonal sum
$L_i(A) \oplus N_i = L_i(B)$, so that every element $y$ of $L_i(B)$ is uniquely
expressible as $y = y' + y''$, where $y' \in L_i(A)$ and $y'' \in N_i$. The extended
isometric operator $V$ is defined by $Vy = Uy' + Vy''$, the right-hand
side of which is a decomposition of $Vy$, belonging to
$L_{-i}(A) \oplus N_{-i}$, onto the orthogonal subspaces $L_{-i}(A)$ and $N_{-i}$. By (124),
the lineal $D(B)$ is defined by $x = y' + y'' - Uy' - Vy'' = (y' - Uy') + (y'' - Vy'')$,
where $(y' - Uy')$ is any element of $D(A)$ and
$y''$ is any element of $N_i$. We can write this fact as

$$x_B = x_A + x_{N_i} - Vx_{N_i}. \tag{129}$$

Similarly, (125) gives $Bx = i(y' + Uy') + iy'' + iVy''$, i.e.

$$Bx_B = Ax_A + ix_{N_i} + iVx_{N_i}. \tag{130}$$

If $N_i$ and $N_{-i}$ coincide with $M_i(A)$ and $M_{-i}(A)$, the last formulae
define a self-conjugate extension of $A$. It is easily shown that the expression
for $x_B$ as the sum (129) is unique. In other words, we have to show
that, if sum (129) is equal to the zero element, all the terms are
equal to the zero element. In fact, if $x_B = 0$, then $Bx_B = 0$, and
(129) and (130) give $x_A + x_{N_i} - Vx_{N_i} = 0$ and $Ax_A + ix_{N_i} + iVx_{N_i} = 0$.
On multiplying the first equation by $i$ and adding to
the second, we get $(A + iE)x_A + 2ix_{N_i} = 0$; the first term in this
sum belongs to $L_i(A)$, whilst the second is orthogonal to $L_i(A)$, whence
it follows that they are both zero, i.e. $x_{N_i} = 0$, so that $Vx_{N_i} = 0$
and $x_A = 0$, which is what we set out to prove.




If one of the deficiency indices is zero, whilst the other is non-zero,
$A$ has no closed symmetric extensions; such an operator is described
as maximal. A self-conjugate operator, i.e. an operator for which
both deficiency indices vanish, is alternatively described as hypermaximal.
Suppose that $A$ has deficiency indices (1, 1), i.e. that the
subspaces $M_i(A)$ and $M_{-i}(A)$ are one-dimensional, and let $v_0$ and $w_0$
be any given elements of them, where $\|v_0\| = \|w_0\| \ne 0$, so that all
their elements are expressible as $v = \alpha v_0$, $w = \alpha w_0$, where $\alpha$ is any
complex number. The formula $V(\alpha v_0) = e^{i\theta} \alpha w_0$, where $\theta$ is any
given real number from the interval $0 \le \theta < 2\pi$, gives an isometric
mapping of $M_i(A)$ onto $M_{-i}(A)$, and, on adding this transformation
to the $U$ that maps $L_i(A)$ onto $L_{-i}(A)$, we get a unitary
operator $V$; (128) defines a self-conjugate operator $B$, which depends
on the choice of the above-mentioned $\theta$. If the deficiency indices of $A$
are (2, 2), we obtain, on choosing any two mutually orthogonal normalized
elements $v_1$, $v_2$ of $M_i(A)$ and similarly $w_1$, $w_2$ of $M_{-i}(A)$,
the isometric mapping

$$V(\alpha_1 v_1 + \alpha_2 v_2) = \alpha_1 w_1 + \alpha_2 w_2.$$

If we fix the $v_k$ and choose the $w_k$ by all possible methods, we get all the
different $V$. On extending the isometric transformation $U$ as far as the
unitary $V$, a self-conjugate operator $B$ is again obtained in accordance
with (128).

Let us prove a further theorem, giving a new characteristic of the
subspaces $M_i(A)$ and $M_{-i}(A)$.

THEOREM 3. $M_i(A)$ is the subspace of eigenelements of the operator
$A^*$ corresponding to the eigenvalue $\lambda = i$, i.e. the subspace of solutions
of the equation $A^*x = ix$, whilst $M_{-i}(A)$ is the subspace of solutions
of $A^*x = -ix$.

Notice that, since $A^*$ is a closed operator, the lineal of its eigenelements
corresponding to a given eigenvalue is always closed, i.e.
is a subspace. The elements $v$ of the subspace $M_i(A)$ are characterized
by the fact that they are orthogonal to $(A + iE)x$ for any $x$ of $D(A)$,
i.e. they are characterized by the equation $((A + iE)x, v) = 0$,
which may be rewritten as $(Ax, v) = (x, iv)$, where $x$ is any element
of $D(A)$. In view of the definition of $A^*$, the last equation is equivalent
to the fact that $v \in D(A^*)$ and $A^*v = iv$, and the assertion of the
theorem regarding $M_i(A)$ is proved. The assertion regarding $M_{-i}(A)$
is proved in the same way. It follows from this theorem that the
existence of non-zero deficiency indices is bound up with the fact that
the operator $A^*$ is no longer symmetric on $D(A^*)$ and has the eigenvalue
$i$ or $(-i)$, or both.

We could have taken any non-real number $\lambda$ of the upper half-plane
and its conjugate $\bar\lambda$ instead of $\pm i$. In this case the formulae

$$y = (A + \lambda E)x; \quad z = (A + \bar\lambda E)x \quad (x \in D(A))$$

perform a one-to-one mapping of $D(A)$ onto certain subspaces $L_\lambda(A)$
and $L_{\bar\lambda}(A)$, and we have the isometric mapping $z = Uy$ of the first
onto the second.

The complementary subspaces $M_\lambda(A)$ and $M_{\bar\lambda}(A)$ are subspaces of
solutions of the equations $A^*x = -\bar\lambda x$ and $A^*x = -\lambda x$. Let $(p_\lambda, q_\lambda)$
denote the dimensionalities of these subspaces. These are the deficiency
indices of $A$. It will be shown below that they do not depend on the
choice of $\lambda$ from the upper half-plane. Formulae (129) and (130)
take the form

$$x_B = x_A + x_{N_\lambda} - Vx_{N_\lambda}; \quad Bx_B = Ax_A - \bar\lambda x_{N_\lambda} + \lambda Vx_{N_\lambda}.$$

The first gives $D(B)$, which can obviously be written as

$$D(B) = D(A) \dotplus (E - V)N_\lambda, \tag{129$_1$}$$

where $N_\lambda$ is a subspace of $M_\lambda(A)$ and $V$ is the isometric operator
transforming $N_\lambda$ into the subspace $N_{\bar\lambda}$, lying in $M_{\bar\lambda}(A)$.

Let $L_k$ ($k = 1, 2, \ldots, n$) be lineals. The direct sum of the $L_k$ is
defined as the set $L$ of elements that are expressible as
$x = x_1 + x_2 + \ldots + x_n$, where $x_k \in L_k$, if this expression for $x$ is unique ($L$ is
a lineal). Formula (129$_1$) provides an example of a direct sum. The
direct sum is often written in the form $L = L_1 \dotplus L_2$ (a point over
the $+$ sign).


203. The conjugate operator. We established in Theorem 3 the
connection between the subspaces introduced and the conjugate
operator. We shall end by explaining the composition of $D(A^*)$ and
the connection between $A^*$ and $A$. Let $x_A$, $x_i$ and $x_{-i}$ denote any
elements of $D(A)$, $M_i(A)$ and $M_{-i}(A)$. We have the theorem:

THEOREM. The formula

$$v = x_A + x_i + x_{-i} \tag{131}$$

gives the whole of the lineal $D(A^*)$ and

$$A^*v = Ax_A + ix_i - ix_{-i}. \tag{132}$$

Every element of $D(A^*)$ is uniquely expressible by (131).




It follows at once from the fact that $A^*$ is an extension of $A$ and
Theorem 3 of [202] that the elements $v$ defined by (131) belong to
$D(A^*)$ and that (132) holds. Let us show that $v$ is uniquely expressible
by (131), i.e. that

$$x_A + x_i + x_{-i} = 0 \tag{133}$$

implies that all three terms are zero elements.

Application of the operator $A^*$ to (133) gives $Ax_A + ix_i - ix_{-i} = 0$,
and, on multiplying (133) by $i$ and adding to this last equation, we
get $(A + iE)x_A + 2ix_i = 0$. The terms on the left-hand side are
mutually orthogonal, so that both vanish, i.e. $x_i = 0$. Similarly, on
multiplying (133) by $(-i)$, we get $x_{-i} = 0$ and, in view of (133),
$x_A = 0$. It remains to show that every element $v$ of $D(A^*)$ is expressible
by (131). Such an element is characterized by the fact that
$(Ax, v) = (x, v^*)$ for all $x$ of $D(A)$, or, in view of (124) and (125),
$(i(y + Uy), v) = (y - Uy, v^*)$, whence it follows that

$$(Uy, v^* - iv) = (y, v^* + iv) \quad (y \in L_i(A)). \tag{134}$$

On projecting onto mutually orthogonal subspaces, we can write

$$v^* - iv = x_{L_{-i}} + x_{-i}; \quad v^* + iv = x_{L_i} + x_i, \tag{135}$$

where $x_{L_{-i}} \in L_{-i}(A)$ and $x_{L_i} \in L_i(A)$. On substituting in (134) and
using the fact that $x_{-i} \perp Uy$ and $x_i \perp y$, we get
$(Uy, x_{L_{-i}}) = (y, x_{L_i})$ or, on putting $x_{L_i} = y'$ and $x_{L_{-i}} = Uy''$, where $y'$ and $y''$
belong to $L_i(A)$, we can write $(Uy, Uy'') = (y, y')$; further, since $U$
is isometric, $(y, y'' - y') = 0$ for any $y$ of $L_i(A)$, and in particular,
for $y = y'' - y'$, which leads to the equation $y'' = y'$, i.e. $x_{L_i} = y'$
and $x_{L_{-i}} = Uy'$. On substituting in (135) and subtracting these
equations term by term, we get
$v = (1/2i)(y' - Uy') + (1/2i)x_i - (1/2i)x_{-i}$, whence it follows that $v$ is expressible by (131), since
$(1/2i)x_i \in M_i(A)$ and $(-1/2i)x_{-i} \in M_{-i}(A)$.

A corollary of the theorem must be mentioned. It follows directly
from the theorem that

$$w = (A^* - iE)v \tag{136}$$

transforms the lineal $D(A^*)$ of elements $v$ onto the whole of $H$. For, by (131) and
(132), we get $w = (A - iE)x_A - 2ix_{-i}$, the first term being any
element of $L_{-i}(A)$ and the second any element of the complementary
subspace $M_{-i}(A)$. Thus, if we interpret (136) as an equation in $v$,
it has solutions for any $w \in H$. The homogeneous equation
$(A^* - iE)v = 0$ now has the subspace $M_i(A)$ as its set of solutions. If it is
non-empty, i.e. the first deficiency index $p \ne 0$, (136) has an infinite set of
solutions $v = v_0 + x_i$ for any $w$, where $v_0$ is any particular solution
of (136), and $x_i$ any element of $M_i(A)$. A particular solution $v_0$ has
the form (131), and, in view of the possibility of adding to the solution
any element of $M_i(A)$, we can assume that $v_0$ does not contain
$x_i$, i.e. the solution of (136) can be written as $v = (x_A + x_{-i}) + x_i$,
where $x_A$ and $x_{-i}$ are definite elements and $x_i$ is an arbitrary element.
If $p = 0$, $x_i$ is missing, and a definite solution is obtained. Similarly,
the equation

$$(A^* + iE)v' = w'$$

with an arbitrary $w'$ of $H$, has the general solution
$v' = (x'_A + x'_i) + x'_{-i}$, where $x'_A$ and $x'_i$ are definite elements, whilst $x'_{-i}$ is
arbitrary. If the second deficiency index $q = 0$, $x'_{-i}$ is absent.

Notice that the following formulae hold:
$v = x_A + x_\lambda + x_{\bar\lambda}$; $A^*v = Ax_A - \bar\lambda x_\lambda - \lambda x_{\bar\lambda}$ ($\operatorname{Im} \lambda > 0$),
analogous to (131) and (132).


204. Maximal operators. A simple method can be given for forming
maximal operators. We choose a complete orthonormal system in $H$:

$$x_1, x_2, \ldots \tag{137}$$

and define an isometric transformation $U$ by the formula
$Ux_k = x_{k+1}$ ($k = 1, 2, \ldots$), i.e. given any element $y$ of $H$:

$$y = \sum_{k=1}^{\infty} a_k x_k \quad \Big(\sum_{k=1}^{\infty} |a_k|^2 < +\infty\Big)$$

we have

$$Uy = \sum_{k=1}^{\infty} a_k x_{k+1}.$$


Following the notation of [201], we can say that $L'$ is $H$, whilst
$L''$ is formed by all the base vectors (137) except for $x_1$, and formulae
(124) and (125), in which $y$ is any element of $H$, lead us to a closed
symmetric operator $A$ with deficiency indices (0, 1). We only have to
verify that the lineal $D(A)$ formed by the elements $y - Uy$ is dense
in $H$. Obviously, all we need is to prove the existence of elements
$x$ of $D(A)$ such that the norm $\|x_k - x\|$ is as small as desired, where
$x_k$ is any given base vector. We form the element

$$y = \sum_{s=0}^{m-1} \frac{m - s}{m}\, x_{k+s},$$

where $m$ is a positive integer. We have

$$x = y - Uy = \sum_{s=0}^{m-1} \frac{m - s}{m}\, x_{k+s} - \sum_{s=0}^{m-1} \frac{m - s}{m}\, x_{k+s+1} = x_k - \frac{1}{m} \sum_{s=1}^{m} x_{k+s},$$

whence it follows, by Pythagoras' theorem and the fact that the
$x_{k+s}$ are normed, that

$$\|x_k - x\|^2 = \frac{1}{m^2} \sum_{s=1}^{m} \|x_{k+s}\|^2 = \frac{1}{m},$$

and, on letting $m$ increase indefinitely, we get values as small as desired
for $\|x_k - x\|$, which is what we set out to prove. A maximal operator
of this type is called an elementary symmetric operator. If we put
$B = -A$, $D(B)$ becomes equal to $D(A)$, and we get instead of (119)
and (120): $y = (-B + iE)x$, $z = (-B - iE)x$, whence it is clear,
on replacing $x$ by $(-x)$, that $(B + iE)x$ maps the lineal $D(B)$
onto the subspace $L_{-i}(A)$ and $(B - iE)x$ maps $D(B)$ onto
$L_i(A)$, i.e. when $A$ is replaced by $(-A)$, $L_i$ and $L_{-i}$ change places,
and consequently, if $A$ is an operator of the type described, with
deficiency indices (0, 1), $(-A)$ has deficiency indices (1, 0).
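
The density estimate above lends itself to an exact check with finitely many coordinates. The sketch below is an illustration only: elements are represented by their coefficient lists with respect to the base vectors (137), the list is cut off after the last non-zero coefficient of $Uy$, and the function name `density_gap_sq` is hypothetical.

```python
from fractions import Fraction

def density_gap_sq(k, m):
    """Return ||x_k - (y - Uy)||^2 for y = sum_{s=0}^{m-1} ((m - s)/m) x_{k+s}."""
    size = k + m + 1                      # indices 0..k+m; index j holds the coefficient of x_j
    y = [Fraction(0)] * size              # index 0 is unused because the basis starts at x_1
    for s in range(m):
        y[k + s] = Fraction(m - s, m)
    Uy = [Fraction(0)] + y[:-1]           # U x_j = x_{j+1}: shift every coefficient one place up
    x = [a - b for a, b in zip(y, Uy)]    # x = y - Uy, an element of D(A) by (124)
    e_k = [Fraction(int(j == k)) for j in range(size)]
    return sum((a - b) ** 2 for a, b in zip(e_k, x))

for m in (1, 10, 1000):
    print(m, density_gap_sq(3, m))
```

Exact rational arithmetic is used so that the value can be compared with $1/m$ without rounding error.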

Let $U_0$ be a unitary operator transforming the base vectors (137)
into the base vectors $x'_k$. Application of the above method to the
$x'_k$ gives us an isometric operator $U'x'_k = x'_{k+1}$ and an elementary
symmetric operator $A'$, where obviously $U' = U_0 U U_0^{-1}$,
$A' = U_0 A U_0^{-1}$, and $D(A')$ is obtained from $D(A)$ with the aid of the
operator $U_0$. It can be shown that, if $A$ is any closed symmetric
operator with deficiency indices $(0, q)$, where $q > 0$ and is finite, then
$H$ can be written as an orthogonal sum of subspaces:
$H = L_0 \oplus L_1 \oplus L_2 \oplus \ldots \oplus L_q$, reducing the operator $A$ and such that each
$L_k$ with $k \ge 1$ is infinite-dimensional, whilst the operator $A_k$, induced
by $A$ in $L_k$, is an elementary symmetric operator; the subspace
$L_0$, which may in fact be absent, may be either infinite- or finite-dimensional,
and the operator $A_0$ induced in it is self-conjugate.
A similar result holds when $q = \infty$.

In the case of deficiency indices $(p, 0)$, where $p > 0$, the $A_k$ are
the elementary symmetric operators with reversed sign.




205. Extension of symmetric semi-bounded operators. Let $A$ be a
symmetric semi-bounded operator with lower bound $m_A$:

$$(Ax, x) \ge m_A (x, x). \tag{138}$$

We shall assume for the present that $m_A > 0$, i.e. that $A$ is a positive
definite operator. We shall try to widen it in such a way that it
remains symmetric and its range of values becomes the whole
of $H$. As we know [187], such an extension leads to a self-conjugate
operator.

We associate the operator $A$ with a real-valued quadratic functional

$$J_y(x) = (Ax, x) - (y, x) - (x, y) \quad (x \in D(A)), \tag{139}$$

where $y$ is any fixed element of $H$, and we consider the question of
its minimum.

THEOREM 1. If the equation

$$Ax = y \tag{140}$$

has a solution $x \in D(A)$, then $J_y(x) \le J_y(z)$, where $z$ is any element
of $D(A)$ and the $=$ sign holds only with $z = x$. Conversely, if, for a given
$x \in D(A)$, $J_y(x) \le J_y(z)$, where $z$ is any element of $D(A)$, then $x$ satisfies
equation (140).

Let $x \in D(A)$ and satisfy (140). Given any $z \in D(A)$, we have,
since $A$ is symmetric,

$$J_y(z) = (Az, z) - (y, z) - (z, y) = (Az, z) - (Ax, z) - (z, Ax) = (A(x - z), x - z) - (Ax, x) = (A(x - z), x - z) + J_y(x),$$

and the first part of the theorem follows from

$$(A(x - z), x - z) \ge m_A \|x - z\|^2$$

and $m_A > 0$.

Conversely, if $x \in D(A)$ and $J_y(x) \le J_y(z)$ for all $z \in D(A)$, the
quadratic function $J_y(x + tz)$ of the real parameter $t$ has a minimum
at $t = 0$ for any fixed $z \in D(A)$. Hence it follows that

$$(Ax, z) + (Az, x) - (y, z) - (z, y) = 0,$$

or

$$(Ax, z) + (z, Ax) - (y, z) - (z, y) = 0,$$

i.e. $\operatorname{Re}(Ax - y, z) = 0$, where $\operatorname{Re}$ is the sign of the real part. On
replacing $z$ by $iz$, we get $\operatorname{Im}(Ax - y, z) = 0$, i.e. $(Ax - y, z) = 0$,
and, since $D(A)$ is dense in $H$, we get $Ax - y = 0$, which proves the
second part.
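
The identity $J_y(z) = (A(x - z), x - z) + J_y(x)$ used in the proof can be verified exactly in a small real example. The sketch below rests on assumptions that are not in the text: $H$ is replaced by the real plane, $A$ by a positive definite symmetric 2x2 matrix, and the test points `z` are arbitrary.

```python
from fractions import Fraction as F

# Illustrative data: a real positive definite symmetric matrix and a right-hand side y.
A = [[F(2), F(1)], [F(1), F(3)]]
y = [F(1), F(2)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def apply(M, v):
    return [dot(row, v) for row in M]

def J(v):
    """Real form of (139): J_y(v) = (Av, v) - 2 (y, v)."""
    return dot(apply(A, v), v) - 2 * dot(y, v)

# Solve A x = y by Cramer's rule, i.e. equation (140).
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
x = [(A[1][1] * y[0] - A[0][1] * y[1]) / det,
     (A[0][0] * y[1] - A[1][0] * y[0]) / det]

def excess(z):
    """Return J_y(z) - J_y(x) together with the energy (A(z - x), z - x)."""
    d = [a - b for a, b in zip(z, x)]
    return J(z) - J(x), dot(apply(A, d), d)

for z in ([F(0), F(0)], [F(1), F(-1)], [F(7, 3), F(2)]):
    print(z, excess(z))
```

The two numbers returned by `excess` coincide and are positive whenever `z` differs from the solution `x`, which is the content of the first part of the theorem.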

This theorem shows that the functional $J_y(z)$ attains a least value
on $D(A)$ only for those $y \in H$ for which equation (140) has a solution
$x \in D(A)$. For this reason we shall subsequently widen the domain of
definition of our functional. In accordance with (139), it is defined on $D(A)$.

We introduce a new scalar product into $D(A)$ by putting

$$[x, y]_A = (Ax, y) \quad \text{or simply} \quad [x, y] = (Ax, y), \tag{141}$$

so that the new norm is

$$\|x\|_A^2 = [x, x] = (Ax, x). \tag{142}$$

We have by (138):

$$\|x\| \le \frac{1}{\sqrt{m_A}} \|x\|_A. \tag{143}$$

It is easily verified that, given the previous definition of multiplication
by a number and of addition, and with the scalar product (141),
the elements of the lineal $D(A)$ satisfy all the Hilbert space axioms,
excepting possibly the axiom of completeness. If this axiom is not
satisfied, we can fill out $D(A)$ with new ideal elements so as to obtain
a complete Hilbert space, which we shall write as $H_A$. We must investigate
this completion. Let a fundamental sequence of $D(A)$ be given,
with scalar product (141). By (143), it will also be a fundamental
sequence in $H$, and will have some limit $x'$ in $H$, since $H$ is complete.
Fundamental sequences of $D(A)$ with scalar product (141), belonging to
the same class, lead to the same element $x'$ of $H$, i.e. if $x_n$ and
$y_n \in D(A)$ ($n = 1, 2, \ldots$) and $\|y_n - x_n\|_A \to 0$ as $n \to \infty$, then
$\|y_n - x_n\| \to 0$ also. This follows from (143). We must also show
that distinct elements $x'$ and $y'$ of $H$ correspond to fundamental
sequences $x_n$ and $y_n$ of $D(A)$ with scalar product (141), belonging to
different classes.

We have for any $z \in D(A)$:

$$(Az, x_n - y_n) = [z, x_n - y_n].$$

Passage to the limit gives

$$(Az, x' - y') = [z, v],$$

where the $v$ on the right is a non-zero element of $H_A$, since the sequences
$x_n$ and $y_n$ belong to different classes in $H_A$. If it turned out that
$x' = y'$, we should have $[z, v] = 0$; but this is impossible, since $D(A)$
is dense in $H_A$, and the element $v \in H_A$ is non-zero.

Let us extend the definition of the functional $J_y(x)$ to the whole
of $H_A$ by putting

$$J_y(x) = [x, x] - (y, x) - (x, y), \tag{144}$$

and investigate the above minimum problem for it. Given any fixed
$y \in H$, $(x, y)$ is a linear bounded functional on $x$ in $H_A$, since

$$|(x, y)| \le \|x\|\,\|y\| \le \frac{1}{\sqrt{m_A}} \|y\|\,\|x\|_A,$$

and, by a familiar theorem [123], there exists a unique element
$x_0 \in H_A$ such that

$$(x, y) = [x, x_0] \quad (x \in H_A;\ y \in H) \tag{145}$$

and hence $(y, x) = [x_0, x]$. Expression (144) can be written as

$$J_y(x) = [x, x] - [x_0, x] - [x, x_0] = [x - x_0, x - x_0] - [x_0, x_0],$$

whence it follows that $J_y(x_0) \le J_y(x)$ for all $x \in H_A$, where the $=$ sign
only holds for $x = x_0$. Moreover, it follows from (145) and the fact
that $H_A$ is dense in $H$ that distinct $x_0$ correspond to distinct $y \in H$.
It follows from the same equation (145) that the set of all solutions
$x_0$ of the variational problems corresponding to all possible $y \in H$
is a lineal $l$. What has been said enables us to define a distributive
operator $\tilde A$ on $l$ in accordance with

$$\tilde A x_0 = y, \tag{146}$$

and this operator has an inverse defined on the whole of $H$. By Theorem
1, $\tilde A$ is a widening of $A$. It is natural to write $D(\tilde A)$ instead of $l$. Notice
that $D(\tilde A) \subset H_A$. Let us show that $\tilde A$ is a symmetric operator in
$H$. It follows from (145) that

$$(\tilde A x_0, x) = [x_0, x] \quad (x_0 \in D(\tilde A);\ x \in H_A), \tag{147}$$

and we have, on putting $x = x_0$:

$$(\tilde A x_0, x_0) \ge 0,$$

i.e. $(\tilde A x_0, x_0)$ is real for $x_0 \in D(\tilde A)$, whence follows the symmetry of
$\tilde A$ [187]. The inverse $\tilde A^{-1}$ is defined on the whole of $H$ and is also symmetric,
i.e. it is a bounded self-conjugate operator, i.e. $\tilde A$ is also self-conjugate
in $H$. We have thus proved the following:




THEOREM 2. A symmetric operator $A$, satisfying condition (138)
with $m_A > 0$, admits of a self-conjugate extension $\tilde A$ such that $\tilde A^{-1}$ is
defined in the whole of $H$ and is bounded.

The construction given above for the widening of a semi-bounded 
symmetric operator is due to Friedrichs (Math. Ann., 109, 4/5, 
1934). The proof is taken from the book by S. G. Mikhlin The Problem 
of the Minimum of a Quadratic Functional (Problema minimuma 
kvadratichnogo funktsionala) (1952). 

Now suppose that the symmetric operator $A$ satisfies condition
(138) with $m_A \le 0$. Now, the symmetric operator
$B = A + (\varepsilon - m_A)E$ ($\varepsilon > 0$), for which $D(B) = D(A)$, satisfies the condition
$(Bx, x) \ge \varepsilon(x, x)$ for $x \in D(B)$, and the above theorem leads us to
the following:

THEOREM 3. Every symmetric semi-bounded operator $A$ admits of
a self-conjugate extension $\tilde A$, such that, given any $\varepsilon > 0$, the operator
$[\tilde A + (\varepsilon - m_A)E]^{-1}$ is defined throughout $H$ and is bounded.

If $A$ is not self-conjugate, it admits of an infinite set of self-conjugate
extensions. The extension $\tilde A$ obtained is usually called a Friedrichs
extension.

If $x \in D(A)$, we had (143) with $m_A > 0$. Suppose that $x \in H_A$,
but does not belong to $D(A)$. There now exists, by definition of $H_A$,
a sequence $x_n \in D(A)$ ($n = 1, 2, \ldots$) such that $x_n \Rightarrow x$ in the norm
of $H_A$, and all the more in the norm of $H$.

By writing the inequality for $x_n$ and using the continuity of the
norm, we can say that, with $m_A > 0$, inequality (143) is true for the
whole of $H_A$.

If $x \in D(\tilde A)$ and hence $x \in H_A$, it follows from (145) that
$(\tilde A x, x) = [x, x] = \|x\|_A^2$, and inequality (143) gives

$$(\tilde A x, x) \ge m_A (x, x). \tag{148}$$

We have assumed during the proof that $m_A > 0$. If $m_A \le 0$, we
form the operator $B = A + (\varepsilon - m_A)E$, where $\varepsilon > 0$. By what has
been said, we have $(\tilde B x, x) \ge \varepsilon(x, x)$, where $\tilde B = \tilde A + (\varepsilon - m_A)E$,
whence (148) follows. We thus have the following result.

THEOREM 4. Inequality (148) holds for $\tilde A$.

We now take the case $m_A > 0$ and prove the following:

THEOREM 5. Given a positive definite symmetric operator $A$, we have
$H_A = D(\tilde A^{1/2})$ and

$$[x, x] = \|\tilde A^{1/2} x\|^2. \tag{149}$$




We first show that $H_A \subset D(\tilde A^{1/2})$. Let $x \in H_A$. There exists a
sequence $x_n \in D(A)$ ($n = 1, 2, \ldots$) such that $\|x - x_n\|_A \to 0$, and
the equation holds for $x_n$:

$$[x_n, x_n] = (Ax_n, x_n) = \|\tilde A^{1/2} x_n\|^2 = \int_{m_A - 0}^{+\infty} \lambda\, d(\mathcal{E}_\lambda x_n, x_n), \tag{150}$$

where $\mathcal{E}_\lambda$ is the spectral function of $\tilde A$.
On noting that

$$\|x - x_n\| \to 0, \quad \|x_n - x_m\|_A^2 = \|\tilde A^{1/2}(x_n - x_m)\|^2 \to 0$$

and that $\tilde A^{1/2}$ is closed, we can say that $x \in D(\tilde A^{1/2})$,

$$\tilde A^{1/2} x_n \Rightarrow \tilde A^{1/2} x \quad \text{and} \quad \|\tilde A^{1/2} x\| = \|x\|_A$$

(continuity of the norm), i.e. equation (149) holds for $x$. To complete
the proof, we have to show that $D(\tilde A^{1/2}) \subset H_A$. Let $x \in D(\tilde A^{1/2})$,
and let us show that $x \in H_A$.

We take the sequence of elements $x_n = \mathcal{E}_n x$ ($n = 1, 2, \ldots$). It
is clear that $x_n \in D(\tilde A)$ and $\|x - x_n\| \to 0$ as $n \to \infty$.

In view of the convergence of the integral

$$\int_{m_A - 0}^{+\infty} \lambda\, d(\mathcal{E}_\lambda x, x)$$

the elements

$$\tilde A^{1/2} x_n = \int_{m_A - 0}^{n} \sqrt{\lambda}\, d\mathcal{E}_\lambda x$$

are convergent to $\tilde A^{1/2} x$ in $H$ as $n \to \infty$, so that

$$\|\tilde A^{1/2}(x_n - x_m)\|^2 = (\tilde A(x_n - x_m), x_n - x_m) = \|x_n - x_m\|_A^2 \to 0$$

as $m$ and $n \to \infty$. This means that the $x_n$ form a fundamental sequence
in $H_A$. Let $z$ be the corresponding element of $H_A$. Hence $x_n \Rightarrow z$
in $H_A$. But, as we have seen above, $x_n \Rightarrow x$ in $H$, so that $x = z \in H_A$,
which is what we set out to prove.

When $m_A > 0$, the operator $\tilde A^{-1}$ is defined throughout $H$ and is
bounded. We shall now take the cases when it is completely continuous.
Let us bring in the operator $W$, which associates each element $x \in H_A$
with the same element $x$, regarded as an element of $H$.

THEOREM 6. The necessary and sufficient condition for $\tilde A^{-1}$ to be a
completely continuous operator is that the operator $W$ which embeds
$H_A$ in $H$ be completely continuous.




Necessity. Let $\tilde A^{-1}$ be completely continuous. Its spectrum is
of a purely point type, and it lies in the interval $[0, 1/m_A]$, where the
eigenvalues, excepting possibly zero, have finite rank, and only the
point $\lambda = 0$ is a point of condensation of the spectrum [136]. The
spectrum of the self-conjugate positive operator $\tilde A^{-1/2}$ has the same
character: each eigenvalue $\lambda$ is replaced by $\sqrt{\lambda}$ and the eigenelements
remain as before.

Hence $\tilde A^{-1/2}$ is also a completely continuous positive operator.
We take any set $U$, bounded in $H_A$, such that, if $x \in U$, then
$\|x\|_A \le C$, where $C$ is a definite constant. We can apply the operator
$\tilde A^{1/2}$ to $x$. Let $y = \tilde A^{1/2} x$, so that $x = \tilde A^{-1/2} y$. We have
$\|y\|^2 = (\tilde A^{1/2} x, \tilde A^{1/2} x) = \|x\|_A^2 \le C^2$, i.e. the set of elements $\tilde A^{1/2} x$ is
bounded in $H$, and the completely continuous operator $\tilde A^{-1/2}$ transforms
them into a compact set in $H$, i.e. a set of $H_A$, bounded in the
norm of $H_A$, is in fact compact in $H$, i.e. the operator $W$ is completely
continuous.

Let us prove the sufficiency. Let $W$ be a completely continuous
operator and $U$ a set of elements bounded in $H$: if $x \in U$, then
$\|x\| \le C$. We have to show that the elements $y = \tilde A^{-1} x$ form a set compact in $H$. The
elements $y$ obviously belong to $D(\tilde A)$, and we have
$\|y\|_A^2 = (\tilde A y, y) = (x, y) \le C\|y\| \le (C/\sqrt{m_A})\,\|y\|_A$, whence $\|y\|_A \le C/\sqrt{m_A}$, i.e.
the set is bounded in the norm of $H_A$, and hence, since $W$ is a completely
continuous operator, is compact in $H$. The theorem is proved.


206. The comparison of semi-bounded operators. Let $A$ and $B$ be
semi-bounded self-conjugate operators. We say that $A$ is not less
than $B$, and write $A \ge B$, if $D(A) \subset D(B)$ and

$$(Ax, x) \ge (Bx, x) \quad \text{for } x \in D(A). \tag{151}$$

If $A$ and $B$ have purely point spectra and the eigenvalues of $A$
and $B$ can be enumerated in non-decreasing order, whilst taking into
account their multiplicities, it can be shown by extending the minimax
principle [136] to the case of non-bounded operators, that
$\lambda_n(A) \ge \lambda_n(B)$, where $\lambda_n(A)$ and $\lambda_n(B)$ are the $n$th eigenvalues of $A$ and
$B$. Let us prove a rather more general theorem.

THEOREM. Let $A \ge B$, and let the spectrum of $B$, situated on the half-line
$\lambda < \beta$, where $\beta$ is a certain number, consist of eigenvalues of finite
multiplicities, which have no points of condensation less than $\beta$. The
spectrum of $A$ now has the same properties and $\lambda_n(A) \ge \lambda_n(B)$ on the
half-line.


It is sufficient to show that, given any $\delta < \beta$, where $\delta$
differs from an eigenvalue of $A$ or $B$, the dimensionality of the subspace
corresponding to the projector $\mathcal{E}_\delta$ is not greater than the
dimensionality of the subspace corresponding to the projector $F_\delta$,
where $\mathcal{E}_\lambda$ and $F_\lambda$ are the spectral functions of $A$
and $B$:

$$\dim \mathcal{E}_\delta H \le \dim F_\delta H. \tag{152}$$

Let us assume the reverse inequality. There must then exist in
$\mathcal{E}_\delta H$ a normalized element $x_0$, orthogonal to the whole of
$F_\delta H$. Notice that $x_0 \in D(A)$, and hence $x_0 \in D(B)$, since
$x_0 \in \mathcal{E}_\delta H = (\mathcal{E}_\delta - \mathcal{E}_{m_A - 0})H$.
We have

$$(Ax_0, x_0) = \int_{m_A - 0}^{+\infty} \lambda\, d(\mathcal{E}_\lambda x_0, x_0) = \int_{m_A - 0}^{\delta} \lambda\, d(\mathcal{E}_\lambda x_0, x_0) \le \delta \|x_0\|^2 = \delta. \tag{153}$$

On the other hand, since $x_0 \perp F_\delta H$, we have

$$(Bx_0, x_0) = \int_{m_B - 0}^{+\infty} \lambda\, d(F_\lambda x_0, x_0) = \int_{\delta}^{+\infty} \lambda\, d(F_\lambda x_0, x_0).$$

But, inasmuch as $F_\lambda$ is constant in some neighbourhood of the
point $\lambda = \delta$, there exists an $\varepsilon > 0$ such that

$$(Bx_0, x_0) = \int_{\delta + \varepsilon}^{+\infty} \lambda\, d(F_\lambda x_0, x_0) \ge \delta + \varepsilon.$$

This inequality contradicts (151) and (153), so that (152) is proved.
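The finite-dimensional analogue of this comparison is easy to test numerically. The following is a minimal sketch, not part of the original text: the matrices `B` and `P` and the helper `eigenvalues_sym2` are illustrative assumptions. It checks that, for real symmetric 2×2 matrices with $A = B + P$ and $P$ positive semi-definite, the eigenvalues enumerated in non-decreasing order satisfy $\lambda_n(A) \ge \lambda_n(B)$.

```python
import math

def eigenvalues_sym2(a, b, d):
    """Eigenvalues (ascending) of the real symmetric matrix [[a, b], [b, d]]."""
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return (mean - radius, mean + radius)

# Hypothetical data: B is symmetric, P is positive semi-definite
# (its diagonal entry 2 > 0 and its determinant 2*1 - 1*1 = 1 > 0),
# so A = B + P satisfies (Ax, x) >= (Bx, x) for every x.
B = (1.0, 2.0, -1.0)   # entries (b11, b12, b22)
P = (2.0, 1.0, 1.0)    # entries (p11, p12, p22)
A = tuple(b + p for b, p in zip(B, P))

eig_A = eigenvalues_sym2(*A)
eig_B = eigenvalues_sym2(*B)

# Compare the n-th eigenvalues pairwise, as in the statement of the theorem.
ordered = all(la >= lb for la, lb in zip(eig_A, eig_B))
print(eig_A, eig_B, ordered)
```

In finite dimensions the domains coincide with the whole space, so the condition $D(A) \subseteq D(B)$ is automatic; the sketch illustrates only the eigenvalue inequality.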

Note. Let us return to symmetric semi-bounded operators. Let
$A$ be such an operator (not necessarily positive definite). Let us define
the subspace $H_A$ for it. Let $a$ be any number satisfying $a > -m_A$,
so that the operator $A + aE$ is positive definite. We shall assume
that $H_A$ consists of all the elements of $H_{A+aE}$ and we introduce
into $H_A$ the bilinear functional $[x, y]_A$, which is an extension of
$(Ax, y)$ to the whole of $H_A$:

$$[x, y]_A = [x, y]_{A+aE} - a(x, y). \tag{154}$$

The functional $[x, y]_A$ is continuous in $H_A$. It may easily be shown
that

$$[x, y]_A = \int_{m_A - 0}^{+\infty} \lambda\, d(\mathcal{E}_\lambda x, y), \tag{155}$$

where $\mathcal{E}_\lambda$ is the spectral function of the self-conjugate
operator $A$. Notice also that $H_{A+aE}$ consists of the same elements for
all $a > -m_A$. This follows from the fact that $D(A + aE)$ does not depend
on $a$, and the norms of $H_{A+aE}$ are equivalent for all $a > -m_A$. Notice
that the operator $A$ may be self-conjugate.

It may easily be shown that (151) is equivalent to the condition

$$[x, x]_A \ge [x, x]_B,$$

and that the spectrum of the self-conjugate operator $A$ on the half-line
$\lambda < \beta$, where the role of $\beta$ is indicated in the theorem, can
be found as the successive lower bounds of $(Ax, x)$ for $x \in D(A)$ and
$\|x\| = 1$, on condition that $x$ is orthogonal to the eigenelements already
found [cf. 136]. This problem can be replaced by the problem of the
successive minima of $[x, x]_A$ for $x \in H_A$ and $\|x\| = 1$.


207. Examples on the theory of extensions. 1. We have shown that the
operator $D = i\,d/dx$ in space $H = L_2(0, +\infty)$ has no self-conjugate
extensions [188]. We shall adhere to the notation of [188] and prove this
result by using deficiency indices. Remember that the closed symmetric
operator $A$ is the operator $D$ defined on the set of functions $\varphi(x)$,
absolutely continuous in any finite interval $[0, a]$, with derivatives of
$L_2(0, +\infty)$ and satisfying the condition $\varphi(0) = 0$. The operator
$A^*$ is the operator $D$ on the set of functions $\varphi(x)$ satisfying the
above-mentioned conditions except for $\varphi(0) = 0$.

Let us form the spaces $M_i(A)$ and $M_{-i}(A)$ of eigenelements of the
operator $A^*$, corresponding to the eigenvalues $\pm i$, i.e. the subspaces of
solutions of the equations $A^*\psi(x) = \pm i\psi(x)$ or
$i\psi'(x) = \pm i\psi(x)$. We get $\psi(x) = Ce^{x}$ and $\psi(x) = Ce^{-x}$.
But $e^{x}$ does not belong to $L_2(0, +\infty)$, and we see that the deficiency
indices of $A$ are (0, 1). The operator $A$ is a maximal operator, $L_i(A)$ is
the whole of $L_2(0, +\infty)$ and $L_{-i}(A)$ consists of the functions
belonging to $L_2(0, +\infty)$ and orthogonal to $e^{-x}$ on the interval
$(0, +\infty)$.

If we introduce the orthonormal system of Laguerre functions
$\varphi_k(x) = e^{-x} p_k(x)$ $(k = 0, 1, 2, \ldots)$, where $p_k(x)$ is a
polynomial of degree $k$, it is easily shown that
$U\varphi_k(x) = \varphi_{k+1}(x)$, where $U$ is an isometric operator mapping
$L_i(A)$ onto $L_{-i}(A)$, i.e. $A$ is an elementary symmetric operator.
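The count of the deficiency indices in this example rests on two elementary integrals. The following worked check is a sketch added for illustration; only the two candidate solutions $e^{x}$ and $e^{-x}$ are taken from the text.

```latex
% Square summability of the two candidate solutions on (0, +\infty):
\int_0^{+\infty} \bigl|e^{-x}\bigr|^2\,dx = \int_0^{+\infty} e^{-2x}\,dx = \tfrac12 < \infty ,
\qquad
\int_0^{a} \bigl|e^{x}\bigr|^2\,dx = \frac{e^{2a}-1}{2} \;\longrightarrow\; +\infty \quad (a \to +\infty).
% Hence e^{-x} spans a one-dimensional space of solutions in L_2(0,+\infty),
% whereas e^{x} contributes only the zero element.
```

Exactly one of the two spaces of solutions is therefore non-trivial, and it is one-dimensional, which is what the pair (0, 1) records.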

2. Let us take the operator $L(y) = -y''$ in space $H = L_2(-\infty, +\infty)$.
Let $A$ denote this operator on the lineal $D(A)$ of finite functions, having
continuous derivatives up to the second order. This operator is symmetric.
As may easily be seen, the conjugate $A^*$ is the same operator $L(y)$ on the
lineal of functions $y(x)$ with the following properties: $y(x)$ and $y'(x)$
are absolutely continuous in any finite interval, whilst $y(x)$ and
$y''(x) \in L_2(-\infty, +\infty)$. It can be shown that here
$y'(x) \in L_2(-\infty, +\infty)$ also. The operator $A^{**} = \bar{A}$
coincides with $A^*$, i.e. $\bar{A}$ is a self-conjugate operator [cf. 188].
The equations $-y'' = \pm iy$ have no solutions in $L_2(-\infty, +\infty)$.
Let us now take the same operator $L(y)$ on the interval $[0, +\infty)$. Let
$l'$ be the lineal of functions $y(x)$ with the following properties: $y(x)$
and $y'(x)$ are absolutely continuous in any finite interval $[0, a]$, whilst
$y(x)$ and $y''(x) \in L_2(0, +\infty)$. Let us also find the lineal $l$ of the
elements $y(x)$ of $l'$ which satisfy the conditions $y(0) = y'(0) = 0$ and

$$\lim_{x \to \infty} \bigl(-y'\bar{z} + y\bar{z}'\bigr) = 0$$

for any $z(x) \in l'$.

If $A$ is the operator $L(y)$ on $l$, $A^*$ is the same operator on $l'$, where
$A$ is a closed symmetric operator. The equations $-y'' = \pm iy$ each have
one solution in $L_2(0, +\infty)$ (discounting a constant factor):

$$y = e^{-\frac{1 \mp i}{\sqrt{2}}\,x},$$
so that the deficiency indices of A are (1, 1). To obtain a self-conjugate extension, 
we have to impose a boundary condition at the end x = 0. In the case of the 
condition $y(0) = 0$, the operator has no point spectrum, and the continuous
spectrum fills the interval $\lambda \ge 0$. There exists a unique differential solution


y(x) = + (va cos Ax — sin ya z) ; 


On forming the resolvent, i.e. the solutions of the equation —y” +- (o + 
+ tt) y = f(x) with the conditions y(0) = 0 and +> 0, and passing to the 
limit, we get the spectral function 


+æ 7 
& He) = — +f Ae T E 
0 

All these results are obtained with the aid of simple working. It is easy to
show that, if $y(x) \in D(A^*)$, then $y'(x) \in L_2(0, +\infty)$. The theory of
linear differential operators of the second order will be discussed in vol. VI.
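The statement in this example that each of the equations $-y'' = \pm iy$ has exactly one square summable solution on the half-line can be checked with a few lines of arithmetic. The sketch below is illustrative and not from the source; the helper name `decaying_exponent` is an assumption. It looks for solutions $y = e^{kx}$, so that $k^2 = \mp i$, and keeps the root with negative real part, the only one giving an element of $L_2(0, +\infty)$.

```python
import cmath

def decaying_exponent(sign):
    """Return k with Re k < 0 such that y = exp(k*x) solves -y'' = sign * i * y."""
    # Substituting y = exp(k*x) into -y'' = sign*i*y gives k**2 = -sign*i.
    root = cmath.sqrt(-sign * 1j)
    # The two roots are +root and -root; keep the one that decays as x -> +infinity.
    return root if root.real < 0 else -root

for sign in (+1, -1):
    k = decaying_exponent(sign)
    residual = abs(-k ** 2 - sign * 1j)     # how well -y'' = sign*i*y is satisfied
    l2_norm_sq = 1.0 / (-2.0 * k.real)      # integral of |exp(k*x)|^2 over (0, +infinity)
    print(sign, k, residual, l2_norm_sq)
```

The other root of each quadratic has positive real part and grows exponentially, so each equation contributes a single solution up to a constant factor, in agreement with the deficiency indices (1, 1).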

3. In [188] we considered the Laplace operator in space $L_2(D)$. All the
conditions of Theorem 3 of [187] are fulfilled for the operator $A$ given
there, and we shall consider the self-conjugate Friedrichs extension of $A$.
The space $H_A$ is obtained by completing $D(A)$ in the metric

$$\|u\|_A = \sqrt{(Au, u)} = \sqrt{\int_D -\Delta u \cdot \bar{u}\, dx} = \sqrt{\int_D \sum_k |u_{x_k}|^2\, dx}.$$

But this norm is equivalent to the norm of $\mathring{W}_2^{(1)}(D)$ [114], so
that $H_A$ is $\mathring{W}_2^{(1)}(D)$. Remember that $\mathring{W}_2^{(1)}(D)$
is obtained by completion in the norm of $W_2^{(1)}(D)$ of the set of all once
continuously differentiable finite functions. But it may readily be seen
(averaging process) that we could take as the initial set the infinitely
differentiable finite functions. The functional $J_f(u)$ on $u \in D(A)$ has
the form

$$J_f(u) = \int_D \bigl[-\Delta u \cdot \bar{u} - 2\,\Re(u\bar{f})\bigr]\, dx, \tag{156}$$

where $f(x) \in L_2(D)$, or

$$J_f(u) = \int_D \Bigl[\sum_k |u_{x_k}|^2 - 2\,\Re(u\bar{f})\Bigr]\, dx. \tag{157}$$

It has a meaning in this form for any function $u \in \mathring{W}_2^{(1)}(D)$,
and the variational problem of [205] consists in finding the function
$u \in \mathring{W}_2^{(1)}(D)$ which gives the least value of functional (157).
We have seen that this problem has a unique solution for any
$f(x) \in L_2(D)$. On associating all these solutions, obtained with different
choices of $f(x) \in L_2(D)$, with $D(A)$, we arrive at the self-conjugate
extension $\tilde{A}$ of $A$, where

$$\tilde{A}u = f. \tag{158}$$
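The link between the minimum problem for functional (157) and the equation satisfied by the minimizing function can be made explicit by a first-variation computation. The following is a standard sketch, not taken from the source: $\eta$ denotes an arbitrary admissible comparison function and $t$ a real parameter, both introduced here for illustration.

```latex
% Assumption: u minimizes (157); \eta is any admissible function, t is real.
J_f(u + t\eta) = J_f(u)
  + 2t\,\Re \int_D \Bigl[\sum_k u_{x_k}\,\overline{\eta_{x_k}} - f\,\bar{\eta}\Bigr]\,dx
  + t^2 \int_D \sum_k |\eta_{x_k}|^2\,dx .
% At a minimum the coefficient of t must vanish; replacing \eta by i\eta
% shows that the imaginary part of the same integral vanishes too, so that
\int_D \sum_k u_{x_k}\,\overline{\eta_{x_k}}\,dx = \int_D f\,\bar{\eta}\,dx .
```

This integral identity is the generalized form of the equation $-\Delta u = f$: for a sufficiently smooth $u$ and finite $\eta$, one integration by parts on the left-hand side recovers the differential equation itself.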


Since $\tilde{A}$ is a self-conjugate extension of $A$, we have
$D(\tilde{A}) \subseteq D(A^*)$. But functions of $D(A^*)$ have generalized
derivatives up to and including the second order inside $D$, which are square
summable over any domain $D'$ lying strictly inside $D$, and the operator
$A^*$ is evaluated on them as a Laplace operator [188]. Hence $\tilde{A}$ is
also a Laplace operator, i.e. equation (158) has the form

$$-\sum_k u_{x_k x_k} = f(x). \tag{159}$$

We have thus shown that the solution of the present variational problem
belongs to $W_2^{(2)}(D')$ as well as to $\mathring{W}_2^{(1)}(D)$, where $D'$
is any domain lying strictly inside $D$, and that it satisfies Poisson's
equation.

On the other hand, equation (159) has an infinite set of solutions of
$L_2(D)$. It is enough to add to the solution given above a harmonic function
of $L_2(D)$. The condition that the solution belong to
$\mathring{W}_2^{(1)}(D)$ distinguishes one solution from this class; we in
fact obtained this solution from the variational problem. This solution must
vanish in a definite sense on the boundary $S$ of the domain $D$ [113]. This
makes clear the connection of our extension of $A$ with the Dirichlet problem
for Poisson's equation:

$$-\Delta u = f(x); \qquad u|_S = 0. \tag{160}$$


Everything proved above for the Laplace operator is also valid for general
linear elliptic self-conjugate operators of the second order [IV; 147]. The
theory of Friedrichs extensions reveals the solubility of the Dirichlet
problem for them in a generalized sense, namely in the sense of the solution
belonging to $\mathring{W}_2^{(1)}(D)$. In actual fact, it turns out that this
generalized solution of the Dirichlet problem, corresponding to a Friedrichs
extension of an elliptic operator, belongs to $W_2^{(2)}(D')$ or even to
$W_2^{(2)}(D)$, provided only that $S$ is a sufficiently smooth surface. This
was established by O. A. Ladyzhenskaya in The Closure of an Elliptic Operator
(O zamykanii ellipticheskogo operatora) (Dokl. AN SSSR t. 79, No. 6, 1951).
Ladyzhenskaya's article "A simple proof of the solubility of boundary value
problems and of eigenvalue problems for linear elliptic operators" (Prostoe
dokazatel'stvo razreshimosti kraevykh zadach i zadachi o sobstvennykh
znacheniyakh dlya lineinykh ellipticheskikh operatorov) (Vestnik
Leningradskogo universiteta, No. 11, 1955) is concerned with the same problem.


208. The spectrum of a symmetric operator. We introduced earlier
the concept of the spectrum of a self-conjugate operator and established
a classification of its points. We shall do the same thing in the
next sections for a closed symmetric operator, and investigate the
variation of the spectrum with symmetric extensions of the operator.

Let $A$ be a closed symmetric operator. The number $\lambda$ is called a
point of regular type of operator $A$ if there exists a $k > 0$ such that

$$\|(A - \lambda E)x\| \ge k\|x\| \qquad (x \in D(A)). \tag{161}$$

In view of (161) and the fact that $A$ is closed, $R(A - \lambda E)$ must
be a subspace and $(A - \lambda E)^{-1}$ is a linear operator bounded on
$R(A - \lambda E)$. Conversely, if $(A - \lambda E)^{-1}$ exists on
$R(A - \lambda E)$ and is bounded, (161) follows from this, i.e. $\lambda$ is a
point of regular type. It is easily shown, as in [129], that the set of points
of regular type is open. The number $\lambda$ is called a regular point of $A$
if (161) is fulfilled and $R(A - \lambda E)$ is the whole of $H$. If
$R(A - \lambda E) = H$, $\lambda$ is not an eigenvalue of $A$, since
$R(A - \lambda E)$ must be orthogonal to the eigenelements, and
$(A - \lambda E)^{-1}$ is an operator bounded in $H$ [186], i.e. (161) is
satisfied.

If $\lambda$ is a real regular point of $A$, $(A - \lambda E)^{-1}$ is a bounded
self-conjugate operator, so that $A$ is self-conjugate. Let us show that
the set of regular points is open. It is enough to show that, if $\lambda_0$ is
regular, the equation

$$(A - \lambda E)x = y \tag{162}$$

is uniquely soluble for any $y \in H$, provided $\lambda$ is sufficiently close
to $\lambda_0$. Suppose that

$$|\lambda - \lambda_0| < \frac{1}{\|(A - \lambda_0 E)^{-1}\|}.$$

We rewrite (162) as

$$(A - \lambda_0 E)x + (\lambda_0 - \lambda)x = y,$$

which is equivalent to

$$x = (\lambda - \lambda_0)(A - \lambda_0 E)^{-1}x + (A - \lambda_0 E)^{-1}y,$$

whilst this latter is uniquely soluble for any $y \in H$ [88], since

$$\|(\lambda - \lambda_0)(A - \lambda_0 E)^{-1}\| < 1.$$

It may be shown as in [129] that, if $\lambda = \sigma + \tau i$ and $\tau \ne 0$,

$$\|(A - \lambda E)x\| \ge |\tau|\,\|x\| \qquad (x \in D(A)), \tag{163}$$

i.e. all non-real $\lambda$ are points of regular type.
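Estimate (163) follows from the symmetry of $A$ alone. The short derivation below is a sketch supplied for convenience; the source only refers to [129] for it. It uses the convention $(x, \alpha y) = \bar{\alpha}(x, y)$ for the scalar product.

```latex
\begin{aligned}
\|(A-\lambda E)x\|^2
  &= \|(A-\sigma E)x - i\tau x\|^2 \\
  &= \|(A-\sigma E)x\|^2 + \tau^2\|x\|^2
     - 2\,\Re\bigl((A-\sigma E)x,\; i\tau x\bigr).
\end{aligned}
% Since A is symmetric and \sigma is real, ((A-\sigma E)x, x) is real, so
% ((A-\sigma E)x, i\tau x) = -i\tau\,((A-\sigma E)x, x) is purely imaginary
% and the last term vanishes. Therefore
\|(A-\lambda E)x\|^2 = \|(A-\sigma E)x\|^2 + \tau^2\|x\|^2 \;\ge\; \tau^2\|x\|^2 .
```

Taking square roots gives (163) with the constant $k = |\tau|$ in the definition (161) of a point of regular type.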

Suppose that $\lambda_0 = \sigma + \tau i$, where $\tau \ne 0$, is a regular
point. Now, by (163) and what has been said regarding the solubility of (162),
all the $\lambda$ satisfying $|\lambda - \lambda_0| < |\tau|$ will also be
regular points. On starting out from the regular value $\lambda_0$ and applying
the argument just given a suitable number of times, every non-real
$\lambda = \sigma' + \tau' i$ in which the sign of $\tau'$ is the same as the
sign of $\tau$ is seen to be a regular point. This proposition can be stated
as follows:

Lemma. If one of the deficiency indices $p_\lambda$ or $q_\lambda$ of an
operator $A$ vanishes for $\lambda = \lambda_0$ $(\Im\lambda_0 > 0)$, it
vanishes for all $\lambda$ of the half-plane $\Im\lambda > 0$.

All non-real $\lambda$ are regular points in the case of a self-conjugate
operator. Let us give an example of a closed symmetric operator $A$ which
has no regular points. Let $H$ be $L_2(0, 1)$ and $A$ be the operator
$i\,d/dx$, considered on the set of functions $\varphi(x)$ such that
$\varphi(x)$ is absolutely continuous in the interval $[0, 1]$,
$\varphi(0) = \varphi(1) = 0$ and $\varphi'(x)$ belongs to $L_2(0, 1)$. This is
a closed symmetric operator [188]. Given any choice of $\lambda$, the function
$e^{-i\bar{\lambda}x}$ belongs to $L_2(0, 1)$ and is orthogonal to all the
functions $\psi(x)$ expressible in the form

$$\psi(x) = i\,\frac{d\varphi(x)}{dx} - \lambda\varphi(x),$$

where $\varphi(x) \in D(A)$, i.e. $e^{-i\bar{\lambda}x}$ is orthogonal to
$R(A - \lambda E)$, whence it follows that $\lambda$ is not a regular point.
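The orthogonality used in this example can be verified by one integration by parts. The computation below is a sketch added for illustration; it assumes the scalar product $(f, g) = \int_0^1 f\,\bar{g}\,dx$ and writes the exponential as $e^{-i\bar{\lambda}x}$, whose complex conjugate is $e^{i\lambda x}$.

```latex
\begin{aligned}
\int_0^1 \bigl(i\varphi'(x) - \lambda\varphi(x)\bigr)\,e^{i\lambda x}\,dx
 &= \Bigl[\,i\varphi(x)\,e^{i\lambda x}\Bigr]_0^1
    - \int_0^1 i\varphi(x)\cdot i\lambda\,e^{i\lambda x}\,dx
    - \lambda\int_0^1 \varphi(x)\,e^{i\lambda x}\,dx \\
 &= 0 + \lambda\int_0^1 \varphi(x)\,e^{i\lambda x}\,dx
      - \lambda\int_0^1 \varphi(x)\,e^{i\lambda x}\,dx = 0 .
\end{aligned}
% The boundary term vanishes because \varphi(0) = \varphi(1) = 0.
```

Since the interval is finite, the exponential is square summable for every complex $\lambda$, so $R(A - \lambda E)$ is never the whole of $H$.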

We define the spectrum of $A$ as the set of points of the $\lambda$ plane
complementary to the set of regular points. This is the set of the points
$\lambda$ at which $(A - \lambda E)$ has no bounded inverse defined throughout
$H$. The kernel of the spectrum of $A$ is defined as the set of points
complementary to the set of points of regular type. The spectrum and
the kernel of the spectrum are closed sets, and the former (the spectrum)
contains the latter (the kernel). The kernel of the spectrum must lie
on the real axis. The spectrum may fill the entire plane, as is clear
from the above example.

If $A$ is a self-conjugate operator, the kernel of the spectrum coincides
with the spectrum [189].

It may easily be seen that the kernel of the spectrum of $A$ belongs
to the kernel of the spectrum of any closed symmetric extension of
$A$. This follows from the fact that $\lambda$ belongs to the kernel of the
spectrum of $A$ if and only if there exists a sequence $x_n$ of normalized
elements of $D(A)$ such that $(A - \lambda E)x_n \to 0$ as $n \to \infty$.
This property is obviously preserved with the above-mentioned extensions
of $A$.

We can now classify the points of the kernel of the spectrum of an
operator $A$. As a preliminary, we take the case when $\lambda$ is an
eigenvalue of $A$ ($\lambda$ is a real number). Let $P_\lambda$ be the subspace
of corresponding eigenelements (including the zero element). We can write the
lineal $D(A)$ as the orthogonal sum

$$D(A) = P_\lambda \oplus D_\lambda(A), \tag{164}$$

where $D_\lambda(A)$ is the lineal consisting of elements contained
simultaneously in $H \ominus P_\lambda$ and $D(A)$. Let $A_\lambda$ denote the
operator defined on $D_\lambda(A)$ and coinciding with $A$ on this lineal. If
$\lambda$ is not an eigenvalue, $P_\lambda$ is absent and $D_\lambda(A)$ is the
same as $D(A)$. We shall assume in this case that $A_\lambda$ is $A$. We can
say that $(A_\lambda - \lambda E)$, regarded as an operator in
$H \ominus P_\lambda$, has, for any $\lambda$, an inverse
$(A_\lambda - \lambda E)^{-1}$, defined on $R(A - \lambda E)$. The $\lambda$ for
which $(A_\lambda - \lambda E)^{-1}$ is an unbounded operator belong to the
kernel of the spectrum. This part of the kernel is described as the continuous
part. The eigenvalues also belong to the kernel, and this is described as the
point part of the kernel. Every point of a spectral kernel belongs to one of
these parts, though it may belong to both. We shall say that an eigenvalue
belongs to the purely point part of the kernel if
$(A_\lambda - \lambda E)^{-1}$ is bounded on $R(A - \lambda E)$. Every point of
the kernel belongs either to the continuous or to the purely point part of the
kernel, and indeed, only to one of these parts. When $A$ is given a closed
symmetric extension, the continuous and point parts of the spectral kernel can
only be widened. It may easily be seen that, when $A$ and $A_\lambda$ are
closed, the continuous part of the spectrum of $A$ is characterized by the fact
that $R(A - \lambda E)$ is a non-closed lineal.


209. Some theorems on extensions and their spectra. We shall
start by proving the following theorem:

THEOREM 1. If $\lambda$ is a real point of regular type of a closed symmetric
operator $A$, a self-conjugate extension $\tilde{A}$ of $A$ exists for which
$\lambda$ is a regular point.

It can be assumed without loss of generality that $\lambda = 0$. It follows
from the conditions of the theorem that $R(A)$ is a subspace, and the
bounded inverse $A^{-1}$ is defined on it [208]. We have to show that a
self-conjugate extension $\tilde{A}$ of $A$ exists for which
$R(\tilde{A}) = H$ [189]. Given the hypotheses, we have

$$H = R(A) \oplus U,$$

where $U$ is the subspace of all solutions of the equation $A^*u = 0$
[185]. We know that, in the present case, $R(A^*) = H$ [187], so that
there exists for any $u \in U$ at least one solution of the equation

$$A^*v = u. \tag{165}$$


Let $V$ denote the lineal of all solutions of (165), when $u$ runs over
the whole of $U$. Obviously, $U \subseteq V \subseteq D(A^*)$. Let $\tilde{U}$
denote the lineal of elements of $V$ orthogonal to $U$, and let us form the
lineal $l$ of elements $x$ expressible as

$$x = y + \tilde{u}, \tag{166}$$

where $y \in D(A)$ and $\tilde{u} \in \tilde{U}$. Let us show that expression
(166) for $x$ is unique. If this were not the case, a non-zero element $z$
would exist, belonging simultaneously to $D(A)$ and $\tilde{U}$. But then
$Az \in R(A)$ and $Az = A^*z \in U$, since $z \in \tilde{U}$ and $z \in V$. But
$R(A) \perp U$ and therefore $Az = 0$, i.e. $z \in U$, which in conjunction
with $z \in \tilde{U}$ yields $z = 0$. This proves the uniqueness of expression
(166), i.e. $l$ is the direct sum $D(A) + \tilde{U}$. We now define the
operator $\tilde{A}$ on lineal $l$ by putting
$\tilde{A}x = Ay + A^*\tilde{u}$, and write $D(\tilde{A})$ for $l$. Obviously,
$A \subseteq \tilde{A} \subseteq A^*$. Let us show that $\tilde{A}$ satisfies
all the requirements of the theorem. On the lineal $D(A)$, the operator
$\tilde{A}$ coincides with $A$, and on $\tilde{U}$ the operator $A^*$ gives the
whole of $U$, as follows from the definition of $\tilde{U}$ and the fact that
$A^*u = 0$ for $u \in U$. Thus $R(\tilde{A}) = H$. It remains to prove the
symmetry of $\tilde{A}$ on $D(\tilde{A})$ [187]. Notice first of all that

$$(A^*\tilde{u}_1, \tilde{u}_2) = (\tilde{u}_1, A^*\tilde{u}_2) = 0 \qquad (\tilde{u}_1 \text{ and } \tilde{u}_2 \in \tilde{U}).$$

Let $x_1$ and $x_2 \in D(\tilde{A})$. Now, by (166),
$x_1 = y_1 + \tilde{u}_1$, $x_2 = y_2 + \tilde{u}_2$, and we have

$$\begin{aligned}
(\tilde{A}x_1, x_2) &= (Ay_1 + A^*\tilde{u}_1, x_2) = (y_1, A^*x_2) + (A^*\tilde{u}_1, y_2) \\
&= (y_1, A^*x_2) + (\tilde{u}_1, Ay_2) = (y_1, A^*x_2) + (\tilde{u}_1, Ay_2) + (\tilde{u}_1, A^*\tilde{u}_2) \\
&= (y_1, A^*x_2) + (\tilde{u}_1, \tilde{A}x_2) = (x_1, \tilde{A}x_2).
\end{aligned}$$

The theorem is proved.

It can be shown that, if $A^{-1}$ is completely continuous, $\tilde{A}^{-1}$ is
also completely continuous. For a detailed study of such extensions, both
of abstract and differential operators, see M. I. Vishik (Trudy Moskovskogo
matematicheskogo obshchestva, t. 1, 1952) and L. Hörmander
(Acta Mathematica, 94, 3-4, 1955).

COROLLARY 1. If the real number $\lambda$ belongs to the purely point
part of the spectral kernel, a self-conjugate extension $\tilde{A}$ of $A$
exists for which $\lambda$ also belongs to the purely point spectrum with the
same subspace $P_\lambda$ of eigenelements as $A$.

It may easily be seen that $A_\lambda$, considered in the subspace
$H_\lambda = H \ominus P_\lambda$, is a closed symmetric operator and satisfies
the conditions of Theorem 1. It thus admits of a self-conjugate extension
$\tilde{A}_\lambda$ in $H_\lambda$, such that $\lambda$ is a regular point. The
operator $\tilde{A}$, with the domain of definition
$D(\tilde{A}) = D(\tilde{A}_\lambda) \oplus P_\lambda$, coinciding with
$\tilde{A}_\lambda$ on $D(\tilde{A}_\lambda)$ and with $A$ on $P_\lambda$, is
obviously the extension indicated in the corollary.

COROLLARY 2. If $A$ is a maximal but non-self-conjugate operator,
the continuous part of the spectral kernel of $A$ fills the entire real axis.

For otherwise, $A$ would have self-conjugate extensions.

COROLLARY 3. If there exists a real $\lambda$, not belonging to the continuous
part of the spectral kernel of $A$, the deficiency indices $p_\lambda$ and
$q_\lambda$ are the same for any $\lambda$ $(\Im\lambda > 0)$.

This follows from the fact that, given the hypothesis, $A$ has
self-conjugate extensions.

THEOREM 2. Let the real $\lambda$ be a point of regular type of a closed
symmetric operator $A$ and $U = H \ominus R(A - \lambda E)$. There now exists a
self-conjugate extension $\tilde{A}$ of $A$ for which $\lambda$ belongs to the
purely point spectrum and $U$ is the eigensubspace corresponding to $\lambda$.

We can assume without loss of generality that $\lambda = 0$. Notice that
$R(A)$ and $U$ are subspaces, $U$ being the set of solutions of the equation
$A^*z = 0$. Let $D(A) + U$ denote the set of elements of the form
$x = y + z$, where $y \in D(A)$ and $z \in U$.

This form is unique. For otherwise, we should have a non-zero element
$x_0$ such that $x_0 \in D(A)$ and $x_0 \in U$. It follows from this that
$(x_0, Ax) = 0$ $(x \in D(A))$ and $(Ax_0, x) = 0$. But, since $D(A)$ is dense
in $H$, we get $Ax_0 = 0$, and this contradicts the fact that $\lambda = 0$ is
a point of regular type.

We can therefore define an operator $\tilde{A}$ on the direct sum
$D(\tilde{A}) = D(A) + U$ by putting $\tilde{A}x = Ay$, if $x = y + z$, where
$y \in D(A)$ and $z \in U$.

The symmetry of $\tilde{A}$ follows immediately, since $(\tilde{A}z, y) = 0$ and

$$(z, \tilde{A}y) = (z, Ay) = (A^*z, y) = 0.$$

Let us show that $\tilde{A}$ is self-conjugate. Let

$$(\tilde{A}x, u) = (x, u^*) \qquad (x \in D(\tilde{A})). \tag{167}$$

Since $\tilde{A}^* \subseteq A^*$, we can say that $u \in D(A^*)$ and
$u^* = A^*u$. On writing $x$ as $x = y + z$ as above, we obtain

$$(\tilde{A}x, u) = (Ay, u) = (y, u^*) + (z, u^*) = (y, A^*u) + (z, u^*),$$

or

$$(Ay, u) = (Ay, u) + (z, u^*),$$

$$(z, u^*) = (z, A^*u) = 0.$$

This equation holds for all $z \in U$, so that $u^* = A^*u \in R(A)$, and
consequently there exists an element $y_0 \in D(A)$ such that $Ay_0 = u^*$,
whence $A^*u - Ay_0 = A^*(u - y_0) = 0$, i.e. $u - y_0 = z_0 \in U$ and
$u = y_0 + z_0 \in D(\tilde{A})$. By (167), $\tilde{A}$ is in fact
self-conjugate. The remaining assertions of the theorem regarding the
properties of $\tilde{A}$ follow at once from its construction.

COROLLARY. If the real $\lambda$ belongs to the purely point spectrum of $A$,
a self-conjugate extension $\tilde{A}$ can be formed, for which $\lambda$ also
belongs only to the point part of the spectral kernel, where the subspace of
eigenelements of $\tilde{A}$ corresponding to $\lambda$ is the same as the
subspace of eigenelements of $A^*$ corresponding to $\lambda$.

The proof is similar to the proof of Corollary 1 of Theorem 1.


210. The independence of the deficiency indices of $\lambda$. We remarked
above that the deficiency indices $p_\lambda$ and $q_\lambda$ do not depend on
the choice of the complex number $\lambda$ from the upper half-plane. We shall
prove this assertion in the present section. We note first of all that, if a
linear operator $B$ (not necessarily bounded) maps one-to-one a
subspace $V$ onto a subspace $W$, $V$ and $W$ have the same dimensions:
$\dim V = \dim W$. This follows from the fact that the linearly independent
elements $\{x_1, x_2, \ldots\}$ of subspace $V$ are mapped by $B$ into
linearly independent elements $\{Bx_1, Bx_2, \ldots\}$ of subspace $W$, and
vice versa. Let us introduce a further definition. We say that $m$ is
the dimensionality of the lineal $l$ with respect to the modulus of
the lineal $l'$ if $l$ contains $m$ and not more than $m$ linearly independent
elements, no linear combination of which belongs to $l'$ (excluding the
case of all the coefficients vanishing). We usually write
$m = \dim l \pmod{l'}$. The number $m$ may in fact be infinite.
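A small illustration of this definition may help; the choice of lineals is ours and is not taken from the source. Take for $l$ the functions absolutely continuous on $[0, 1]$ with square summable derivative, and for $l'$ those members of $l$ that vanish at both ends of the interval.

```latex
% Illustrative assumption: l and l' are the lineals described in the lead-in.
% The functions 1 and x are linearly independent modulo l':
\alpha\cdot 1 + \beta x \in l'
\;\Longrightarrow\; \alpha = 0 \ (\text{value at } x = 0), \quad
                    \alpha + \beta = 0 \ (\text{value at } x = 1)
\;\Longrightarrow\; \alpha = \beta = 0 .
% A third independent element cannot be added, because for every \varphi \in l
\varphi(x) - \varphi(0)\,(1 - x) - \varphi(1)\,x \in l' .
% Hence
\dim l \pmod{l'} = 2 .
```

The two boundary values $\varphi(0)$ and $\varphi(1)$ are exactly the data that separate an element of $l$ from the elements of $l'$, which is why the count is 2.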

Let $A$ be a closed symmetric operator and $p_\lambda$, $q_\lambda$ its
deficiency indices corresponding to some $\lambda$ from the upper half-plane.
Now, as we have seen [202]:

$$H = L_\lambda(A) \oplus M_\lambda(A), \qquad H = L_{\bar{\lambda}}(A) \oplus M_{\bar{\lambda}}(A), \tag{168}$$

where $M_\lambda(A)$ is the set of all the zeros of the operator
$A^* - \bar{\lambda}E$, $L_\lambda(A)$ is the set of all the elements of the
form $y = (A - \lambda E)x$ for $x \in D(A)$, and similarly for
$M_{\bar{\lambda}}(A)$ and $L_{\bar{\lambda}}(A)$.

Further [203]:

$$D(A^*) = D(A) + M_\lambda(A) + M_{\bar{\lambda}}(A), \tag{169}$$

and any symmetric extension of $A$ may be formed with the aid of
an isometric operator establishing a one-to-one correspondence
between subspaces $N_\lambda(A)$ and $N_{\bar{\lambda}}(A)$ (of the same
dimensionality) of spaces $M_\lambda(A)$ and $M_{\bar{\lambda}}(A)$ [202].

It follows from (169) that

$$p_\lambda + q_\lambda = \dim D(A^*) \pmod{D(A)}, \tag{170}$$

so that the sum $p_\lambda + q_\lambda$ is independent of $\lambda$. Suppose
first that it has a finite value. We form a maximal extension $A_0$ of $A$ and
take a $\lambda = \lambda'$ of the upper half-plane. We can assume without loss
of generality that $p_{\lambda'} \le q_{\lambda'}$. Hence it follows that the
deficiency indices of $A_0$ with $\lambda = \lambda'$ are $(0, r_{\lambda'})$,
where $r_{\lambda'} = q_{\lambda'} - p_{\lambda'}$. By the lemma of [208], the
deficiency indices of $A_0$ are $(0, r_\lambda)$ for all $\lambda$ of the upper
half-plane, where $r_\lambda = q_\lambda - p_\lambda$ [202]. It further follows
from (169) that


$$D(A_0^*) = D(A_0) + M_{\bar{\lambda}}(A_0),$$

and we have

$$r_\lambda = \dim M_{\bar{\lambda}}(A_0) = \dim D(A_0^*) \pmod{D(A_0)},$$

whence it is clear that $r_\lambda = q_\lambda - p_\lambda$ is independent of
$\lambda$. Thus $p_\lambda$ and $q_\lambda$ are independent of $\lambda$. Now
suppose that $p_\lambda + q_\lambda$ is equal to infinity. If we have
$p_\lambda = \infty$ and $q_\lambda = \infty$ for some $\lambda'$,
self-conjugate extensions of $A$ are possible, and hence
$p_\lambda = \infty$, $q_\lambda = \infty$ for any $\lambda$.

It remains to consider the case when one deficiency index is finite
and the other infinite for some $\lambda = \lambda'$.

Let $p_{\lambda'}$ be finite and $q_{\lambda'} = \infty$. It follows at once
from the above that $p_\lambda$ will be finite and $q_\lambda = \infty$ for
every $\lambda$. We only need to show that $p_\lambda$ does not depend on
$\lambda$. This is easily seen from the formula [202]:

$$D(A_0) = D(A) + (E - V_\lambda)M_\lambda(A), \tag{171}$$

where $A_0$ is a fixed maximal extension, and $V_\lambda$ is an isometric
operator, dependent on the choice of $A_0$ and $\lambda$. In view of the
existence of $(E - V_\lambda)^{-1}$, it follows from (171) that

$$p_\lambda = \dim D(A_0) \pmod{D(A)}, \tag{172}$$

which shows in fact that $p_\lambda$ does not depend on $\lambda$.

It follows from Theorem 1 [209] that, if a real $\lambda = \lambda_0$ exists,
which is a point of regular type of a closed symmetric operator $A$, $A$
admits of self-conjugate extensions, i.e. given the existence of a real point
of regular type, the deficiency indices $(p, q)$ (independent of $\lambda$) are
the same. They are equal to the dimensionality of the subspace $U$ of
eigenelements of $A^*$ corresponding to the eigenvalue $\lambda = \lambda_0$.

For, on taking $\lambda_0 = 0$ and writing $\tilde{A}$ for the self-conjugate
extension of $A$ for which $\lambda = 0$ is a regular point [209], we have in
the present case: $H = R(A) \oplus U$; $D(\tilde{A}) = D(A) + \tilde{A}^{-1}U$,
the sums representing the elements on the left-hand side being unique;
consequently,

$$p = \dim D(\tilde{A}) \pmod{D(A)} = \dim \tilde{A}^{-1}U = \dim U.$$


211. The invariance of the continuous part of the spectral kernel
in the case of symmetric extensions. We shall discuss in this section
the closed symmetric extensions of a closed symmetric operator $A$,
on the assumption that the deficiency indices $(p, q)$ of $A$ are finite.

We start by proving a simple lemma.

Lemma. If $U$ and $W$ are two subspaces, the second of which is
finite-dimensional, then

$$V = U + W, \tag{173}$$

i.e. the set of elements $x = y + z$, where $y \in U$ and $z \in W$, is also a
subspace.

We can obviously assume that $W$ has no elements in common with
$U$ (except the zero element). Let $(w_1, w_2, \ldots, w_n)$ be the base of
$W$. We write each $w_k$ as $w_k = w_k' + w_k''$, where $w_k' \in U$ and
$w_k'' \perp U$. Let $W''$ denote the linear envelope of the $w_k''$
$(k = 1, 2, \ldots, n)$. The set $V$ can be written as the orthogonal sum of
two subspaces:

$$V = U \oplus W'',$$

and the lemma is proved.

THEOREM. The continuous part of the spectral kernel of any closed
symmetric extension $\tilde{A}$ of the operator $A$ is the same as for $A$.

We have seen that the continuous part of the spectral kernel cannot
diminish on extension of $A$ [208]. Suppose that it increases, i.e. a
real number $\lambda_0$ exists, which does not belong to the continuous part
of the spectral kernel of $A$, but is contained in the continuous part
of the kernel of $\tilde{A}$. Now, $R(A - \lambda_0 E)$ is a subspace, and
$R(\tilde{A} - \lambda_0 E)$ a non-closed lineal.

On taking into account the formula for $D(\tilde{A})$ [203] and the fact
that the deficiency indices of $A$ are finite, we can write

$$R(\tilde{A} - \lambda_0 E) = R(A - \lambda_0 E) + W,$$

where $W$ is a finite-dimensional subspace. But this last formula, and
the fact that $R(\tilde{A} - \lambda_0 E)$ is non-closed, contradict our lemma.




212. The spectra of self-conjugate extensions. We have seen that,
if $\lambda$ is a real point of regular type for $A$, there exist two types of
self-conjugate extensions $\tilde{A}$ of $A$: for one, $\lambda$ is a regular
point of $\tilde{A}$, and for the other it is an eigenvalue with multiplicity
equal to the deficiency index of $A$ (the indices are equal).

Let us supplement these results.

THEOREM 1. If the deficiency indices $(p, p)$ of a closed symmetric operator
$A$ are finite, given any self-conjugate extension $\tilde{A}$ of $A$, the
multiplicity of any eigenvalue can be raised by not more than $p$, and a real
$\lambda$ which is not an eigenvalue of $A$ cannot be an eigenvalue of
$\tilde{A}$ of multiplicity higher than $p$.

Suppose that $\lambda$ is not an eigenvalue of $A$, but is an eigenvalue of
$\tilde{A}$ of multiplicity $k > p$. It follows from (172):

$$p = \dim D(\tilde{A}) \pmod{D(A)}$$

and $k > p$, that there is an eigenelement of $\tilde{A}$ belonging to $D(A)$,
i.e. $\lambda$ is an eigenvalue of $A$, which contradicts our hypothesis. We
have thus shown that $k \le p$. The case when $\lambda$ is an eigenvalue of $A$
is similarly considered, using the operator $A_\lambda$ [208].

THEOREM 2. If $A$ is a semi-bounded closed symmetric operator and
its deficiency indices $(p, p)$ are finite, the spectrum of any self-conjugate
extension of $A$ lying to the left of the lower bound of $A$ can only consist
of a finite number of eigenvalues, the sum of the multiplicities of which
does not exceed $p$.

We can assume without loss of generality that $A$ is a positive
operator. Let $\mathcal{E}_\lambda$ be the spectral function of the
self-conjugate extension $\tilde{A}$. Let us show that, given any
$0 > \beta > \alpha$,

$$\dim \Delta\mathcal{E}_\lambda H = \dim (\mathcal{E}_\beta - \mathcal{E}_\alpha)H \le p. \tag{174}$$

Suppose the reverse inequality holds:

$$\dim \Delta\mathcal{E}_\lambda H > p. \tag{175}$$

We know that $\Delta\mathcal{E}_\lambda x \in D(\tilde{A})$ for $x \in H$ and
$p = \dim D(\tilde{A}) \pmod{D(A)}$. It follows from this that, by (175), there
is a normalized element $x \in D(A)$ in the subspace
$\Delta\mathcal{E}_\lambda H$. But now,

$$(Ax, x) = (\tilde{A}x, x) = \int_{-\infty}^{+\infty} \lambda\, d(\mathcal{E}_\lambda x, x) = \int_{\alpha}^{\beta} \lambda\, d(\mathcal{E}_\lambda x, x) \le \beta < 0,$$

which contradicts the fact that $A$ is positive. Thus (174), and therefore
the theorem, is proved.


213. Examples. 1. We considered in [188], in space $H = L_2(D)$, the operator
$A$ which is defined on all smooth finite functions in $D$ and is the operator
of differentiation of these functions:

D= (i 5 ə" g(x) 


m. 176 
Ly -oo OLR (176) 


We proved that $A$ is a symmetric operator, having a bounded inverse on
$R(A)$. Hence it follows [209] that $A$ admits of a self-conjugate extension
$\tilde{A}$ such that the equation

$$\tilde{A}\varphi = \psi \qquad (\varphi \in D(\tilde{A})) \tag{177}$$

is uniquely soluble for any $\psi(x) \in L_2(D)$. The domain of definition of
$A$ is supplemented with this extension by the functions $v(x)$ of $L_2(D)$ for
which $A^*A^*v = 0$. These functions have generalized derivatives D* and D*D*
and are found
from the equations : 
a v(x) 
Di D* v= Oxf... OLR =e 
From these, we choose for $D(\tilde{A})$ those that are orthogonal to the
solutions of the equation D* u = 0. The operator $\tilde{A}$ has the form D* on
$D(\tilde{A})$, where D* is the generalized derivative (176).
2. Let us consider the operator

$$A = -\frac{d^2}{dx^2}$$

in space $H = L_2(0, +\infty)$, defined on all smooth functions, finite close
to $x = 0$ and $x = +\infty$. $D(A^*)$ is the set of all functions
$\varphi(x)$ with the following properties: $\varphi(x)$ and $\varphi'(x)$ are
absolutely continuous on any finite interval $[0, a]$, and $\varphi(x)$ and
$\varphi''(x) \in L_2(0, +\infty)$. As pointed out, $\varphi'(x) \in
L_2(0, +\infty)$ also. For $\varphi(x) \in D(A^*)$, we have
$A^*\varphi(x) = -\varphi''(x)$ [188]. $D(A)$ consists of all the elements of
$D(A^*)$ which satisfy the conditions $\varphi(0) = \varphi'(0) = 0$.

The following statements are easily verified: (a) $A$ is a positive operator;
(b) the deficiency indices of $A$ are (1, 1); (c) the continuous part of the
spectral kernel coincides with the semi-axis $0 \le \lambda < +\infty$, and the
same is true for any self-conjugate extension of $A$; (d) any symmetric
extension of $A$ is a self-conjugate extension. It is defined on all the
elements of $D(A^*)$ that satisfy one of the two conditions
$\varphi'(0) - h\varphi(0) = 0$, where $h$ is a fixed real number, or
$\varphi(0) = 0$. The latter condition corresponds to a Friedrichs extension,
and the operator stays positive and has a purely continuous spectrum,
consisting of the semi-axis $0 \le \lambda < +\infty$. If $h \ge 0$, the
self-conjugate operator corresponding to the condition
$\varphi'(0) - h\varphi(0) = 0$ has a purely continuous spectrum. When
$h < 0$, it has one simple eigenvalue $\lambda = -h^2$ (the eigenfunction is
$e^{hx}$, which satisfies the boundary condition and belongs to
$L_2(0, +\infty)$ only for $h < 0$).


214. Infinite matrices. We discussed in [200] integral operators
whose kernels satisfy conditions (107) and (108). We can consider
similarly the operators in $l_2$:

$$y_i = \sum_{k=1}^{\infty} a_{ik}x_k \qquad (i = 1, 2, \ldots), \tag{178}$$

represented by matrices such that $a_{ik} = \bar{a}_{ki}$ and

$$d_k^2 = \sum_{i=1}^{\infty} |a_{ik}|^2 = \sum_{i=1}^{\infty} |a_{ki}|^2 < \infty \qquad (k = 1, 2, \ldots;\ d_k > 0). \tag{179}$$
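Conditions of this kind are easy to probe numerically on a concrete matrix. The sketch below is illustrative only: the matrix $a_{ik} = 1/(i + k)$ and the helper names are assumptions, not taken from the source. It checks the symmetry of the matrix, approximates $d_1^2$ by a partial sum and compares it with the known value $\pi^2/6 - 1$, and applies the truncated formula (178) to a vector with finitely many non-zero components.

```python
import math

def a(i, k):
    """Hypothetical real symmetric matrix entry a_ik = 1/(i + k), i, k = 1, 2, ..."""
    return 1.0 / (i + k)

def column_norm_sq(k, n_terms):
    """Partial sum of d_k^2 = sum over i of |a_ik|^2, truncated after n_terms rows."""
    return sum(abs(a(i, k)) ** 2 for i in range(1, n_terms + 1))

def apply_truncated(x, n_rows):
    """First n_rows components y_i = sum_k a_ik x_k for a vector x with finitely many entries."""
    return [sum(a(i, k) * xk for k, xk in enumerate(x, start=1))
            for i in range(1, n_rows + 1)]

# For a real matrix the condition a_ik = conj(a_ki) reduces to plain symmetry.
symmetric = all(a(i, k) == a(k, i) for i in range(1, 6) for k in range(1, 6))

# d_1^2 = sum of 1/(i+1)^2 = pi^2/6 - 1; the partial sums approach it from below.
d1_sq = column_norm_sq(1, 100_000)
limit = math.pi ** 2 / 6 - 1

# The vector x = (1, -1, 0, 0, ...) and the first four components of its image.
y = apply_truncated([1.0, -1.0], 4)
print(symmetric, d1_sq, limit, y)
```

For this particular matrix every column is square summable, so the finiteness condition on $d_k$ holds for each $k$; whether the image of a general element of $l_2$ is again in $l_2$ is a separate question, which the text takes up next.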


Series (178) are here absolutely convergent for any element $x$ of $l_2$, but the series consisting of $|y_i|^2$ is not necessarily convergent, i.e. $(y_1, y_2, \ldots)$ may not be an element of $l_2$. Let $D(A_0)$ denote the lineal of $x \in l_2$ such that

$$\sum_{k=1}^{\infty} d_k |x_k| < \infty,$$

and $D(B)$ the lineal of $x$ such that $(y_1, y_2, \ldots) \in l_2$. It can be shown, as in [200], that $D(A_0)$ is everywhere dense in $l_2$, and that $D(A_0) \subset D(B)$. On further writing $A_0$ for the operator defined in $D(A_0)$ by (178), and $B$ for the operator defined in $D(B)$ by the same formulae, we can say that $A_0$ is a symmetric operator and that $B = A_0^*$ [cf. 200]. Notice that, by (179), all the base vectors belong to $D(B)$ and even to $D(A_0)$. The necessary and sufficient condition for $B$ to be self-conjugate is that, given any $x$ and $y \in D(B)$, we have [cf. 200]:

$$\sum_{i=1}^{\infty} \Bigl(\sum_{k=1}^{\infty} a_{ik} x_k\Bigr) \bar{y}_i = \sum_{k=1}^{\infty} x_k \Bigl(\sum_{i=1}^{\infty} a_{ik} \bar{y}_i\Bigr), \tag{180}$$

where, since $a_{ik} = \bar{a}_{ki}$:

$$\sum_{i=1}^{\infty} a_{ik} \bar{y}_i = \overline{\sum_{i=1}^{\infty} a_{ki} y_i}.$$


The symmetric operator $A_0$ may in fact be non-closed, and we can introduce the further new operator $A$, which is the closure of $A_0$, as we shall show. Let $D(A)$ be the lineal of $x \in D(B)$ such that

$$(Bx, y) = (x, By) \tag{181}$$

for any $y$ of $D(B)$, and let $A$ be the operator given by (178) on $D(A)$. On the other hand, $A_0^{**}$ is defined on the lineal $D(A_0^{**})$ of elements $x$ such that $(By, x) = (y, x^*)$ for any $y$ of $D(B)$, and, since $A_0^{**} \subset A_0^*$, $x^*$ is given in terms of $x$ by (178), i.e. $x^* = Bx$. On comparing this with the definition of $A$, it will be seen that $A_0^{**}$ coincides with $A$. But $A_0^{**}$ is the closure of $A_0$, i.e. $A$ is the closure of $A_0$, and $A^* = A_0^* = B$. We must mention one property of the lineal $D(A)$ and of the operator $A$. We shall describe as a “finite element” any element $(x_1, x_2, \ldots)$ of $l_2$ which has a finite number of components $x_k$ different from zero. Let $D(A')$ be the lineal of “finite elements” and $A'$ the operator (178), defined on this lineal. Since $a_{ki} = \bar{a}_{ik}$, this operator is symmetric, and any element of $D(A')$ obviously belongs to $D(A_0)$, i.e. $A' \subset A_0$; consequently, on writing $\overline{A'}$ for the closure of $A'$, we have $\overline{A'} \subset A$, and hence $A^* \subset (A')^*$. Let $e_k$ be the $k$th base vector. Any element $y$ of $D((A')^*)$ must satisfy the equation $(Be_k, y) = (e_k, y^*)$, where $y^* = (A')^* y$. On writing $y_i$ and $y_i^*$ for the components of $y$ and $y^*$, this equation can be written as

$$\sum_{i=1}^{\infty} a_{ik} \bar{y}_i = \bar{y}_k^*, \quad \text{i.e.} \quad y_k^* = \sum_{i=1}^{\infty} a_{ki} y_i,$$

whence it is clear that $y \in D(A^*)$ and $y^* = A^* y$. On comparing this result with $A^* \subset (A')^*$, we see that $(A')^* = A^*$, so that $A^{**} = \overline{A'}$. But we have $A^{**} = A$, and hence $\overline{A'} = A$. This result can be stated as follows:

THEOREM. If $x \in D(A)$, there exists a sequence $\xi_n$ of “finite elements” such that $\xi_n \Rightarrow x$, $A\xi_n \Rightarrow Ax$, and $A = \overline{A'}$.

The fact that $A$ is self-conjugate means that $A^* = A$. If this is not the case, the deficiency indices of $A$ are defined by the dimensionality of the subspaces formed by the solutions of the equations $A^*x = ix$ and $A^*x = -ix$, and we can apply the above extension theory. Let us show further that, if the matrix $a_{ik}$ is real and the complex element $x' + x''i$ (where $x'$ and $x''$ have real components) belongs to $D(A)$, then $x'$ and $x''$ also belong to $D(A)$, so that $x' - x''i \in D(A)$ also. For, by the theorem proved above, there exists a sequence $\xi_n = \xi'_n + \xi''_n i$ of “finite elements” such that $\|x - \xi_n\|^2 = \|x' - \xi'_n\|^2 + \|x'' - \xi''_n\|^2 \to 0$ and $\|Ax - A\xi_n\|^2 = \|Ax' - A\xi'_n\|^2 + \|Ax'' - A\xi''_n\|^2 \to 0$, whence it follows that $\|x' - \xi'_n\| \to 0$, $\|Ax' - A\xi'_n\| \to 0$, $\|x'' - \xi''_n\| \to 0$, $\|Ax'' - A\xi''_n\| \to 0$. In view of the fact that $A = \overline{A'}$, we obtain our assertion.


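215. Jacobian matrices. Let us apply the above results to the Jacobian matrix:

$$\begin{pmatrix}
a_0 & b_0 & 0 & 0 & 0 & \cdots \\
b_0 & a_1 & b_1 & 0 & 0 & \cdots \\
0 & b_1 & a_2 & b_2 & 0 & \cdots \\
0 & 0 & b_2 & a_3 & b_3 & \cdots \\
\cdots & \cdots & \cdots & \cdots & \cdots & \cdots
\end{pmatrix} \tag{182}$$

where the $a_k$ are real and $b_k > 0$. Condition (179) is obviously fulfilled. We start with $k = 0$ when enumerating the base vectors. We form the real polynomials $P_k(\lambda)$ in accordance with [cf. 167]:

$$\lambda P_k(\lambda) = b_k P_{k+1}(\lambda) + a_k P_k(\lambda) + b_{k-1} P_{k-1}(\lambda); \qquad P_{-1}(\lambda) = 0; \quad P_0(\lambda) = 1, \tag{183}$$

from which it follows that

$$e_k = P_k(A)\, e_0, \tag{184}$$

where the base vectors have been denoted by $e_k$.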
THEOREM 1. If the series

$$\sum_{k=0}^{\infty} |P_k(i)|^2 \tag{185}$$

is convergent, the operator $A$ is non-self-conjugate.

In view of the convergence of (185), we can form an element $x$ of $l_2$ with the components $x_k = P_k(i)$, so that $(x, e_k) = P_k(i)$ and $(e_k, x) = \overline{P_k(i)}$. On noting that $Ae_k = b_{k-1} e_{k-1} + a_k e_k + b_k e_{k+1}$, and using (183), we get

$$(Ae_k, x) = b_{k-1}\overline{P_{k-1}(i)} + a_k \overline{P_k(i)} + b_k \overline{P_{k+1}(i)} = \overline{i P_k(i)},$$

whence we can write $(Ae_k, x) = (e_k, ix)$, since $(e_k, x) = \overline{P_k(i)}$. Since $A$ and the scalar product are distributive, we have $(Ay, x) = (y, ix)$ for any “finite element” $y$ and, by the theorem of [214], this equation holds for any $y$ of $D(A)$, so that $x \in D(A^*)$ and $A^* x = ix$, whence it follows that $A$ is non-self-conjugate.
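
As a minimal numerical sketch (not part of the original text), recurrence (183) can be run directly to generate the values $P_k(z)$ at a complex point. The function name `jacobi_polys` and the sample coefficients are illustrative assumptions. The code only checks that the components $x_k = P_k(i)$ satisfy the formal relation $b_{k-1}x_{k-1} + a_k x_k + b_k x_{k+1} = i x_k$ used in the proofs of Theorems 1 and 2, and that $P_k(-i) = \overline{P_k(i)}$ for real coefficients; it says nothing about convergence of (185).

```python
def jacobi_polys(a, b, z, n):
    """Return [P_0(z), ..., P_n(z)] generated by recurrence (183).

    a[k] and b[k] are the diagonal and off-diagonal entries of matrix (182);
    both lists must hold at least n entries.
    """
    p_prev, p = 0j, 1 + 0j            # P_{-1} = 0, P_0 = 1
    values = [p]
    for k in range(n):
        b_prev = b[k - 1] if k > 0 else 0.0
        # (183) solved for P_{k+1}:
        # b_k P_{k+1} = (z - a_k) P_k - b_{k-1} P_{k-1}
        p_next = ((z - a[k]) * p - b_prev * p_prev) / b[k]
        values.append(p_next)
        p_prev, p = p, p_next
    return values


# Hypothetical sample data: real a_k and positive b_k.
a = [0.5, -1.0, 2.0, 0.0, 1.5]
b = [1.0, 2.0, 0.5, 3.0, 1.0]
n = 5

x = jacobi_polys(a, b, 1j, n)         # x_k = P_k(i)
x_conj = jacobi_polys(a, b, -1j, n)   # P_k(-i)

# Residual of b_{k-1} x_{k-1} + a_k x_k + b_k x_{k+1} - i x_k, k = 0..n-1.
residuals = []
for k in range(n):
    lower = b[k - 1] * x[k - 1] if k > 0 else 0
    left = lower + a[k] * x[k] + b[k] * x[k + 1]
    residuals.append(abs(left - 1j * x[k]))

print(max(residuals))
```

Whether the infinite vector $(P_0(i), P_1(i), \ldots)$ actually belongs to $l_2$ is exactly the question the theorems settle; a finite computation like this one can only suggest the trend of the partial sums.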

THEOREM 2. If series (185) is divergent, $A$ is self-conjugate.

It is sufficient to show that $A^*$ does not have the eigenvalues $\pm i$. Let us suppose the opposite. Let $A^* x = ix$, where the element $x(x_0, x_1, x_2, \ldots)$ is non-zero. By the definition of $A^*$ and the fact that $e_k \in D(A)$, we have $(Ae_k, x) = (e_k, ix)$ or $(x, Ae_k) = i(x, e_k) = ix_k$, i.e. $(x, b_{k-1} e_{k-1} + a_k e_k + b_k e_{k+1}) = ix_k$. On expanding the scalar products, we get $b_{k-1} x_{k-1} + a_k x_k + b_k x_{k+1} = ix_k$. On using (183) and the method of complete induction, we have $x_k = P_k(i)\, x_0$ and $x_0 \ne 0$. But this contradicts the divergence of (185). If we replace $i$ by $(-i)$ in (185), another divergent series is evidently obtained, since $P_k(-i) = \overline{P_k(i)}$, and, as above, $A^*$ does not have the eigenvalue $(-i)$; the theorem is proved. Thus the divergence of series (185) is necessary and sufficient for $A$ to be self-conjugate. On repeating word for word the proofs of the last two theorems, it can be shown that, if the series

$$\sum_{k=0}^{\infty} |P_k(a)|^2 \tag{186}$$

is convergent for any given non-real $a$, then $a$ and $\bar{a}$ are eigenvalues of $A^*$, and if series (186) is divergent for some non-real $a$, then $a$ and $\bar{a}$ are not eigenvalues of $A^*$, whence it follows that $A$ is self-conjugate, i.e. series (185) is also divergent. Conversely, if series (186) is convergent for some non-real $a$, $A^*$ has non-real eigenvalues and $A$ is not self-conjugate; series (185) is also convergent. These remarks lead to the theorem:

THEOREM 3. Only the following two cases are possible: series (186) 
is divergent for any non-real a or is convergent for any non-real a. In 
the first case A is a self-conjugate operator, whilst it is not self-conjugate 
in the second case. 

Further, it follows at once from the proof of Theorem 2 that, if (185) is convergent, the components of the eigenelements of $A^*$ corresponding to the eigenvalue $i$ satisfy the equations $x_k = P_k(i)\, x_0$ $(k = 1, 2, \ldots)$, where $x_0$ is arbitrary and non-zero, i.e. the subspace $M_i(A)$ is one-dimensional. Similarly, $M_{-i}(A)$ is one-dimensional. It is evidently obtained from $M_i(A)$ by replacing the components $x_k$ by their conjugates. Thus the deficiency indices of $A$ are (1, 1) in the second case. The elements $x_i$ and $x_{-i}$ of subspaces $M_i(A)$ and $M_{-i}(A)$ can be determined up to an arbitrary complex factor from

$$x_i = \sum_{k=0}^{\infty} P_k(i)\, e_k; \qquad x_{-i} = \sum_{k=0}^{\infty} P_k(-i)\, e_k,$$

and the elements of subspace D(A,) of the self-conjugate extension 
A, of A are defined uniquely by v = £4 + ax, where 24 € D(A), a 
is any complex number, x, = i(e~° v; + e°? 2) and 0 < 6 < 2x. 
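Let $\mathcal{E}_\lambda$ be the spectral function of $A$ in the first case or of any given $A_\theta$ in the second case, and $\varrho(\lambda) = (\mathcal{E}_\lambda e_0, e_0)$. We have, precisely as in [167]:

$$\int_{-\infty}^{+\infty} P_k(\lambda) P_l(\lambda)\, d\varrho(\lambda) = \begin{cases} 0 & \text{for } k \ne l, \\ 1 & \text{for } k = l, \end{cases}$$

$$(Ae_k, e_l) = \int_{-\infty}^{+\infty} \lambda P_k(\lambda) P_l(\lambda)\, d\varrho(\lambda), \qquad e_k = \int_{-\infty}^{+\infty} P_k(\lambda)\, d\mathcal{E}_\lambda e_0,$$

where $A$ has to be replaced in the second case by $A_\theta$. The elements of matrix (182) are obviously given by

$$a_{ik} = \int_{-\infty}^{+\infty} \lambda P_i(\lambda) P_k(\lambda)\, d\varrho(\lambda).$$

THEOREM 4. The polynomials $P_k(\lambda)$ form a closed system with respect to $\varrho(\lambda)$.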

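Let $\varphi_\mu(\lambda)$ be functions equal to unity for $-\infty < \lambda \le \mu$ and zero for $\lambda > \mu$. Any function $\alpha(\lambda)$, taking a finite number of values $\alpha_1, \alpha_2, \ldots, \alpha_m$, each value $\alpha_k$ being taken on a finite interval, can evidently be written as a finite linear combination of functions $\varphi_\mu(\lambda)$ with different $\mu$. If, given any $\mu$, we can prove the closure equation for $\varphi_\mu(\lambda)$, it must hold, by virtue of the generalized closure equation, for any linear combination of $\varphi_\mu(\lambda)$, and hence for all functions $\alpha(\lambda)$ of the type indicated. But the lineal of such functions is everywhere dense in $L_2$ with respect to $\varrho(\lambda)$ [60], so that the $P_k(\lambda)$ form a closed system [60]. It is thus sufficient to prove the closure equation for $\varphi_\mu(\lambda)$. Let us evaluate the integral of $\varphi_\mu^2(\lambda)$ and the Fourier coefficients of this function:

$$\int_{-\infty}^{+\infty} \varphi_\mu^2(\lambda)\, d\varrho(\lambda) = \int_{(-\infty, \mu]} d\varrho(\lambda) = \varrho(\mu); \qquad c_k = \int_{-\infty}^{+\infty} \varphi_\mu(\lambda) P_k(\lambda)\, d\varrho(\lambda) = \int_{(-\infty, \mu]} P_k(\lambda)\, d\varrho(\lambda).$$

We have to show that, for any $\mu$:

$$\varrho(\mu) = \sum_{k=0}^{\infty} c_k^2 = \sum_{k=0}^{\infty} \Bigl[\int_{(-\infty, \mu]} P_k(\lambda)\, d\varrho(\lambda)\Bigr]^2.$$

In view of the closure equation, we have

$$\varrho(\mu) = \|\mathcal{E}_\mu e_0\|^2 = \sum_{k=0}^{\infty} (\mathcal{E}_\mu e_0, e_k)(e_k, \mathcal{E}_\mu e_0),$$

and it is sufficient to show that

$$(\mathcal{E}_\mu e_0, e_k) = (e_k, \mathcal{E}_\mu e_0) = \int_{(-\infty, \mu]} P_k(\lambda)\, d\varrho(\lambda),$$

the right-hand side of which is real. But this last equation is a direct consequence of the earlier integral form of $e_k$ [cf. 192], and the theorem is proved.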




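We shall now give a simple sufficient criterion for matrix (182) to be self-conjugate. It follows at once from (183) that

$$b_k \frac{P_{k+1}(a)\overline{P_k(a)} - \overline{P_{k+1}(a)} P_k(a)}{a - \bar{a}} = |P_k(a)|^2 + b_{k-1} \frac{P_k(a)\overline{P_{k-1}(a)} - \overline{P_k(a)} P_{k-1}(a)}{a - \bar{a}}$$

and, on summing over $k$ from $k = 0$ to $k = n - 1$, we obtain

$$\sum_{k=0}^{n-1} |P_k(a)|^2 = b_{n-1} \frac{P_n(a)\overline{P_{n-1}(a)} - \overline{P_n(a)} P_{n-1}(a)}{a - \bar{a}}$$

and, in particular, with $a = i$:

$$\sum_{k=0}^{n-1} |P_k(i)|^2 = b_{n-1} \frac{P_n(i)\overline{P_{n-1}(i)} - \overline{P_n(i)} P_{n-1}(i)}{2i} = b_{n-1}\, \mathrm{Im}\bigl[P_n(i)\overline{P_{n-1}(i)}\bigr].$$

Since $P_0(\lambda) = 1$, the left-hand side is $\ge 1$, whence it follows that

$$\frac{1}{b_{n-1}} \le \mathrm{Im}\bigl[P_n(i)\overline{P_{n-1}(i)}\bigr] \le |P_n(i)|\,|P_{n-1}(i)| \le \tfrac{1}{2}\bigl(|P_n(i)|^2 + |P_{n-1}(i)|^2\bigr),$$

and we get, on summing from $n = 1$ to $n = m + 1$:

$$\sum_{n=0}^{m} \frac{1}{b_n} \le \sum_{n=0}^{m+1} |P_n(i)|^2,$$

whence we have, in view of Theorem 2:

THEOREM 5. If the series formed from the $1/b_n$ is divergent, $A$ is a self-conjugate operator.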
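
The summed identity at $a = i$ and the resulting inequality can be checked numerically. The following is a sketch under stated assumptions: the coefficients $a_k = 0$, $b_k = k + 1$ are a hypothetical choice (the series of $1/b_k$ diverges, so Theorem 5 applies), and `jacobi_polys` is an illustrative helper implementing recurrence (183), not anything defined in the text.

```python
def jacobi_polys(a, b, z, n):
    """Return [P_0(z), ..., P_n(z)] from recurrence (183)."""
    p_prev, p = 0j, 1 + 0j            # P_{-1} = 0, P_0 = 1
    values = [p]
    for k in range(n):
        b_prev = b[k - 1] if k > 0 else 0.0
        p_next = ((z - a[k]) * p - b_prev * p_prev) / b[k]
        values.append(p_next)
        p_prev, p = p, p_next
    return values


N = 12
a = [0.0] * (N + 1)                        # assumed diagonal: a_k = 0
b = [float(k + 1) for k in range(N + 1)]   # assumed off-diagonal: b_k = k + 1
P = jacobi_polys(a, b, 1j, N + 1)          # P_0(i), ..., P_{N+1}(i)

# Identity: sum_{k<n} |P_k(i)|^2 = b_{n-1} * Im[P_n(i) * conj(P_{n-1}(i))].
identity_gaps = []
for n in range(1, N + 1):
    lhs = sum(abs(P[k]) ** 2 for k in range(n))
    rhs = b[n - 1] * (P[n] * P[n - 1].conjugate()).imag
    identity_gaps.append(abs(lhs - rhs))

# Inequality: sum_{n=0}^{m} 1/b_n <= sum_{n=0}^{m+1} |P_n(i)|^2.
inequality_holds = all(
    sum(1.0 / b[n] for n in range(m + 1))
    <= sum(abs(P[n]) ** 2 for n in range(m + 2))
    for m in range(N)
)

print(max(identity_gaps), inequality_holds)
```

For this particular choice the recurrence gives $|P_k(i)| = 1$ for the first few $k$ (for instance $P_1(i) = i$, $P_2(i) = -1$), so the partial sums of (185) grow without bound, in agreement with the theorem.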

We shall mention without proof two facts directly connected with 
the above exposition. It can be shown that, if A has deficiency indices 
(1, 1), series (186) is convergent for any value of a. If A is self-con- 
jugate, (186) is divergent for all real a except those that correspond to 
the point spectrum of A, if this latter exists. Moreover, with deficiency 
indices (1, 1), every self-conjugate extension of A has a purely point 
spectrum (see N. I. Akhiezer, Infinite Jacobian Matrices and the Problem of Moments (Beskonechnye matritsy Jakobi i problema momentov), Uspekhi matem. nauk, t. IX, 1941).




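The Hermitian polynomials may be taken as an example of polynomials generated by a Jacobian matrix. We have defined these polynomials by the equation

$$H_k(\lambda) = (-1)^k e^{\lambda^2} \frac{d^k}{d\lambda^k}\bigl(e^{-\lambda^2}\bigr) \tag{187}$$

and have had the relationship

$$\lambda H_k(\lambda) = \tfrac{1}{2} H_{k+1}(\lambda) + k H_{k-1}(\lambda) \tag{188}$$

and the integral equation [III$_2$; 157]:

$$\int_{-\infty}^{+\infty} e^{-\lambda^2} H_k^2(\lambda)\, d\lambda = 2^k k! \sqrt{\pi}. \tag{189}$$

In order subsequently to obtain normalized polynomials, we introduce instead of (187) the polynomials

$$P_k(\lambda) = \frac{(-1)^k}{\sqrt{2^k k!}}\, e^{\lambda^2} \frac{d^k}{d\lambda^k}\bigl(e^{-\lambda^2}\bigr), \tag{190}$$

after which we can rewrite (188) as

$$\lambda P_k(\lambda) = \sqrt{\frac{k+1}{2}}\, P_{k+1}(\lambda) + \sqrt{\frac{k}{2}}\, P_{k-1}(\lambda),$$

where $P_0(\lambda) = 1$. Thus, if we take a Jacobian matrix by putting

$$a_k = 0; \qquad b_k = \sqrt{\frac{k+1}{2}} \qquad (k = 0, 1, 2, \ldots),$$

we in fact arrive at polynomials (190), where the relationships hold (cf. III; 220):

$$\frac{1}{\sqrt{\pi}} \int_{-\infty}^{+\infty} e^{-\lambda^2} P_k(\lambda) P_l(\lambda)\, d\lambda = \begin{cases} 0 & \text{for } k \ne l, \\ 1 & \text{for } k = l, \end{cases}$$

$$\frac{1}{\sqrt{\pi}} \int_{-\infty}^{+\infty} e^{-\lambda^2} \lambda P_k(\lambda) P_l(\lambda)\, d\lambda = \begin{cases} 0 & \text{for } |k - l| \ne 1, \\ \sqrt{\dfrac{k+1}{2}} & \text{for } l = k + 1. \end{cases}$$

It follows from Theorem 5 that $A$ is a self-conjugate operator in the present case, since the series formed from the $1/b_k = \sqrt{2/(k+1)}$ is divergent. It can be shown by using the integral formulae written above that, for the operator $A$:

$$\varrho(\lambda) = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\lambda} e^{-\mu^2}\, d\mu.$$

$A$ has a simple continuous spectrum, distributed over the entire interval $(-\infty, +\infty)$.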
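
The orthonormality relations for the normalized polynomials can be checked numerically. The sketch below is an illustration under assumptions, not the author's method: it generates $P_k$ from the three-term relation with $a_k = 0$, $b_k = \sqrt{(k+1)/2}$, and approximates the weighted integrals by a plain trapezoid rule on $[-10, 10]$ (the interval, step count and function names are arbitrary choices made here).

```python
import math


def normalized_hermite(x, n):
    """Return [P_0(x), ..., P_n(x)] for a_k = 0, b_k = sqrt((k + 1) / 2)."""
    p_prev, p = 0.0, 1.0              # P_{-1} = 0, P_0 = 1
    values = [p]
    for k in range(n):
        b_k = math.sqrt((k + 1) / 2)
        b_prev = math.sqrt(k / 2)
        # x P_k = b_k P_{k+1} + b_{k-1} P_{k-1}, solved for P_{k+1}
        p_next = (x * p - b_prev * p_prev) / b_k
        values.append(p_next)
        p_prev, p = p, p_next
    return values


n = 4
lo, hi, steps = -10.0, 10.0, 4000
h = (hi - lo) / steps

# Accumulate (1/sqrt(pi)) * integral of exp(-x^2) * P_k * P_l (gram)
# and of exp(-x^2) * x * P_k * P_{k+1} (offdiag) by the trapezoid rule.
gram = [[0.0] * (n + 1) for _ in range(n + 1)]
offdiag = [0.0] * n
for j in range(steps + 1):
    x = lo + j * h
    weight = (0.5 if j in (0, steps) else 1.0) * h * math.exp(-x * x)
    weight /= math.sqrt(math.pi)
    p = normalized_hermite(x, n)
    for k in range(n + 1):
        for l in range(n + 1):
            gram[k][l] += weight * p[k] * p[l]
    for k in range(n):
        offdiag[k] += weight * x * p[k] * p[k + 1]

print(gram[2][2], offdiag[0])
```

The computed `gram` matrix should be close to the identity, and `offdiag[k]` close to $b_k = \sqrt{(k+1)/2}$, which are the entries of the Jacobian matrix just described.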

216. Matrices and operators. Let us investigate the connection between matrices and symmetric operators in Hilbert space $H$. Suppose first that a bounded self-conjugate operator $A$ is given in this space, and let $\varphi_1, \varphi_2, \ldots$ be any complete orthonormal system of elements of $H$. If the formula

$$a_{nk} = (A\varphi_k, \varphi_n) = (\varphi_k, A\varphi_n) \qquad (a_{kn} = \bar{a}_{nk}) \tag{191}$$

defines the elements of some matrix $a$, we have

$$x'_n = \sum_{k=1}^{\infty} a_{nk} x_k, \tag{192}$$

where the $x_k$ are the components of the element $x$, i.e. $x_k = (x, \varphi_k)$, and the $x'_k$ are the components of the transformed element, i.e. $x'_k = (Ax, \varphi_k)$. Thus, given a definite choice of base vectors, the operator $A$ is given by the matrix of (192). If we choose another system of base vectors $\psi_1, \psi_2, \ldots$, and $U$ is the unitary transformation such that $U\varphi_k = \psi_k$ $(k = 1, 2, \ldots)$, where $u_{pq} = (U\varphi_q, \varphi_p) = (\psi_q, \varphi_p)$, the operator $A$ will correspond to the matrix with elements

$$b_{nk} = (A\psi_k, \psi_n) = \sum_{s=1}^{\infty} (A\psi_k, \varphi_s)\overline{(\psi_n, \varphi_s)} = \sum_{s=1}^{\infty} (\psi_k, A\varphi_s)\overline{(\psi_n, \varphi_s)} = \sum_{s=1}^{\infty} \overline{(\psi_n, \varphi_s)} \sum_{t=1}^{\infty} (\psi_k, \varphi_t)\overline{(A\varphi_s, \varphi_t)}, \tag{193}$$

where we have made use of the generalized closure equation $(18_1)$ of [121]. On using the notation introduced above, we can write

$$b_{nk} = \sum_{s=1}^{\infty} \bar{u}_{sn} \sum_{t=1}^{\infty} a_{st} u_{tk}. \tag{194}$$

If we apply this formula to $b_{kn} = \bar{b}_{nk}$, then pass to the conjugates and replace the letter $s$ by $t$ and $t$ by $s$ in the right-hand side, we obtain

$$b_{nk} = \sum_{t=1}^{\infty} u_{tk} \sum_{s=1}^{\infty} a_{st} \bar{u}_{sn}. \tag{195}$$

Similarly,

$$a_{nk} = \sum_{s=1}^{\infty} u_{ns} \sum_{t=1}^{\infty} b_{st} \bar{u}_{kt} = \sum_{t=1}^{\infty} \bar{u}_{kt} \sum_{s=1}^{\infty} u_{ns} b_{st}. \tag{196}$$

Conversely, if $\{a_{nk}\}$ $(a_{kn} = \bar{a}_{nk})$ is a given matrix, satisfying the condition for boundedness [163], and the system of base vectors $\varphi_k$ $(k = 1, 2, \ldots)$ is fixed in $H$, formulae (192) define a bounded operator $A$, self-conjugate in $H$. The $k$th column of the matrix $\{a_{nk}\}$ gives the components of the transformed element obtained from the base vector $\varphi_k$, so that we can write:

$$A\varphi_k = \sum_{n=1}^{\infty} a_{nk} \varphi_n.$$
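
In finite dimensions the two orders of summation in (194) and (195) trivially agree, and (196) inverts the transformation. The sketch below illustrates this with a hypothetical $2 \times 2$ Hermitian matrix and a hypothetical unitary matrix, both chosen only for illustration; for infinite matrices the agreement of the repeated sums is a genuine condition and is not settled by such an example.

```python
import math

dim = 2
# Assumed sample data: a is Hermitian, u is unitary (orthonormal columns).
a = [[complex(2, 0), complex(1, -1)],
     [complex(1, 1), complex(-1, 0)]]
r = 1 / math.sqrt(2)
u = [[complex(r, 0), complex(r, 0)],
     [complex(0, r), complex(0, -r)]]       # u[p][q] stands for u_{pq}


def b_via_194(n, k):
    # b_nk = sum_s conj(u_sn) * sum_t a_st * u_tk
    return sum(
        u[s][n].conjugate() * sum(a[s][t] * u[t][k] for t in range(dim))
        for s in range(dim)
    )


def b_via_195(n, k):
    # b_nk = sum_t u_tk * sum_s a_st * conj(u_sn)
    return sum(
        u[t][k] * sum(a[s][t] * u[s][n].conjugate() for s in range(dim))
        for t in range(dim)
    )


b = [[b_via_194(n, k) for k in range(dim)] for n in range(dim)]
b_alt = [[b_via_195(n, k) for k in range(dim)] for n in range(dim)]

# (196): a_nk = sum_s u_ns * sum_t b_st * conj(u_kt)
a_back = [
    [
        sum(
            u[n][s] * sum(b[s][t] * u[k][t].conjugate() for t in range(dim))
            for s in range(dim)
        )
        for k in range(dim)
    ]
    for n in range(dim)
]

print(b)
```

Here `b` and `b_alt` coincide, `b` is again Hermitian, and `a_back` reproduces `a` up to rounding.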


The connection between matrices and unbounded operators is more complicated. 




We shall in future describe as C-matrices those that satisfy the symmetry condition $(a_{kn} = \bar{a}_{nk})$ and condition (179). Let $F$ be a closed symmetric operator with the lineal $D(F)$, dense in $H$. We take a complete orthonormal system of elements $\varphi_k$ such that all the $\varphi_k$ belong to $D(F)$, and we define the elements of the matrix $\{a_{nk}\}$ by (191) with $A$ replaced by $F$. In view of the symmetry of $F$ and the closure equation, we can say that $\{a_{nk}\}$ is a C-matrix. On applying the generalized closure equation to the scalar product $(Fx, \varphi_n) = (x, F\varphi_n)$, on the assumption that $x \in D(F)$, we obtain

$$x'_n = (Fx, \varphi_n) = \sum_{k=1}^{\infty} (x, \varphi_k)\overline{(F\varphi_n, \varphi_k)} = \sum_{k=1}^{\infty} a_{nk} x_k, \tag{197}$$

i.e. $F$ is given by (192) in the base vectors $\varphi_k$. The lineal $D(F)$ obviously contains all “finite elements”, i.e. all finite linear combinations of base vectors, and, since the operator $A$ defined in [214] on the basis of the matrix $\{a_{nk}\}$ is the closure of the operator $A'$ defined by (192) on the lineal of “finite elements”, we can say that $A \subset F$; consequently, $F^* \subset A^*$, i.e. $F^*$ is also given by (192) on the corresponding lineal $D(F^*)$. If $F$ is an extension of $A$, by Theorem 1 of [186], $D(F^*)$ is only part of the $D(B)$ that we defined in [214]. In the present case the same matrix yields different operators.

If, instead of $\varphi_k$, we take another system of base vectors $\psi_k$ which also belong to $D(F)$, the operator $F$ is given by the matrix $\{b_{nk}\}$, where, as above, (192) and (197) hold, with $a_{nk}$ replaced by $b_{nk}$. Now suppose that a C-matrix $\{a_{nk}\}$ is given, instead of the operator. We take an arbitrary system of base vectors $\varphi_k$ of $H$ and define the operator $A'$ by (192) or (197) on the lineal $D(A')$ of “finite elements”. The closure of $A'$ leads to some closed symmetric operator $A$. We shall say that $A$ is generated by the matrix $\{a_{nk}\}$ and the system of base vectors $\varphi_k$, and we write $A \sim a_{nk}\{\varphi_k\}$. As a matter of fact, any symmetric closed operator in $H$ can be obtained in this way.

THEOREM. Any symmetric closed operator $A$ with $D(A)$ dense in $H$ can be generated by some C-matrix and system of base vectors $\varphi_k$.

It is sufficient to form the corresponding base vectors $\varphi_k$ of $D(A)$. The matrix is given by (191). These base vectors $\varphi_k$ must possess the following property: given any $x$ of $D(A)$, there exists a sequence $\omega_n$ of “finite elements” such that $\omega_n \Rightarrow x$ and $A\omega_n \Rightarrow Ax$. To obtain such $\varphi_k$, it is sufficient to form a sequence $\omega_n$ of $D(A)$ such that, given any $x$ of $D(A)$, there exists a subsequence $\omega_{n_1}, \omega_{n_2}, \ldots$ such that $\omega_{n_k} \Rightarrow x$ and $A\omega_{n_k} \Rightarrow Ax$. Orthogonalization of the $\omega_n$ obviously leads to the $\varphi_k$, where it follows from $\omega_{n_k} \Rightarrow x$ and the fact that $D(A)$ is dense in $H$ that the sequence $\omega_n$ is dense in $H$, so that the system of base vectors $\varphi_k$ is complete. Let us turn to the formation of the $\omega_n$. We take some sequence $x_1, x_2, \ldots$ of elements dense in $H$. Let $p, q, r$ be any positive integer triple. If at any rate one element $x$ of $D(A)$ exists, such that $\|x_p - x\| < 1/r$ and $\|x_q - Ax\| < 1/r$, we associate one of these $x$ with the above triple, and write $x_{p,q,r}$. These elements can [1] obviously be enumerated; let us show that they have the properties required of $\omega_n$. Let $x \in D(A)$ and $\varepsilon$ be any given positive number. We choose $r$ satisfying $1/r < \varepsilon/2$, and elements $x_p$ and $x_q$ such that $\|x_p - x\| < 1/r$ and $\|x_q - Ax\| < 1/r$, which is possible, since the $x_n$ are dense in $H$. There now exists an element $x_{p,q,r}$ such that $\|x_p - x_{p,q,r}\| < 1/r$ and $\|x_q - Ax_{p,q,r}\| < 1/r$, whence, on observing that

$$\|x - x_{p,q,r}\| \le \|x_p - x\| + \|x_p - x_{p,q,r}\|,$$
$$\|Ax - Ax_{p,q,r}\| \le \|x_q - Ax\| + \|x_q - Ax_{p,q,r}\|$$

and that $1/r < \varepsilon/2$, we get: $\|x - x_{p,q,r}\| < \varepsilon$ and $\|Ax - Ax_{p,q,r}\| < \varepsilon$. Hence it follows, since $\varepsilon$ is arbitrary, that the $x_{p,q,r}$ possess the properties used above to characterize the sequence $\omega_n$; the theorem is proved.

The same closed symmetric operator can be generated by different matrices and base vectors. If $A \sim a_{nk}\{\varphi_k\}$ and $A \sim b_{nk}\{\psi_k\}$, and if we bring in the unitary operator $U$ described above, and put $u_{pq} = (\psi_q, \varphi_p)$, we get (194), (195) and (196). Notice also that, if $F$ is a given symmetric closed operator, the system of base vectors $\varphi_k$ belongs to $D(F)$, $\{a_{nk}\}$ is the matrix defined by (191) with $A$ replaced by $F$, and $A \sim a_{nk}\{\varphi_k\}$, then $F$ either coincides with $A$ or is an extension of $A$, as we have seen above.


217. The unitary equivalence of C-matrices. We had in the last section two systems $a_{nk}\{\varphi_k\}$ and $b_{nk}\{\psi_k\}$, generating the same operator $A$. The $u_{pq}$ were the elements of the matrix corresponding to the unitary transformation $U\varphi_k = \psi_k$ in the base vectors $\varphi_k$. The inner sum of (194) is the scalar product $(A\psi_k, \varphi_s)$, and we can say, in view of the closure equation, that the square of the modulus of this sum forms a convergent series on summing over $s$. The inner sum in (195) is obtained from the inner sum of (194) by passing to the conjugates, interchanging $s$ and $t$ and replacing $k$ by $n$. What has been said on summing over $t$ is also true in this case, i.e.

$$\sum_{s=1}^{\infty} \Bigl|\sum_{t=1}^{\infty} a_{st} u_{tk}\Bigr|^2 < \infty; \qquad \sum_{t=1}^{\infty} \Bigl|\sum_{s=1}^{\infty} \bar{u}_{sn} a_{st}\Bigr|^2 < \infty. \tag{198}$$

This naturally leads to the following definition:

DEFINITION 1. The unitary matrix $\{u_{pq}\}$ is said to be applicable to the C-matrix $\{a_{nk}\}$, if condition (198) is satisfied and if the repeated sums (194) and (195) lead to the same result. The resulting matrix $\{b_{nk}\}$ is called the transformed matrix.

It follows from the fact that $\{a_{nk}\}$ is a C-matrix and $\{u_{pq}\}$ a unitary matrix that, by Cauchy’s inequality, the inner series of (194) and (195) are absolutely convergent, whilst (198) implies the convergence of the outer series also. In view of what has been said about the inner sums, one of conditions (198) implies the other condition. It follows at once from (194) and (195) that $b_{kn} = \bar{b}_{nk}$ and, in view of (198) and the fact that $\{u_{pq}\}$ is a unitary matrix, the sum over $k$ of the terms $|b_{nk}|^2$ is finite, i.e. $\{b_{nk}\}$ is a C-matrix. Let us prove the following theorem:

THEOREM 1. If the unitary matrix $U$ is applicable to $\{a_{nk}\}$, the inverse matrix $U^{-1}$ is applicable to $\{b_{nk}\}$, and the transformed matrix is $\{a_{nk}\}$.

We must show that

$$\sum_{n=1}^{\infty} u_{pn} b_{nk} = \sum_{t=1}^{\infty} a_{pt} u_{tk}; \qquad \sum_{k=1}^{\infty} b_{nk} \bar{u}_{qk} = \sum_{s=1}^{\infty} \bar{u}_{sn} a_{sq}. \tag{199}$$

It is enough to prove the first. The proof of the second is similar. We have:

$$\sum_{n=1}^{\infty} u_{pn} b_{nk} = \sum_{n=1}^{\infty} \Bigl[\sum_{s=1}^{\infty} \bar{u}_{sn} \Bigl(\sum_{t=1}^{\infty} a_{st} u_{tk}\Bigr)\Bigr] u_{pn}.$$

By (198), the sums in the round brackets can be regarded as the components $\xi_s$ of some element $\xi \in l_2$. On writing $\mathcal{U}$ for the unitary operator in $l_2$ realized by the matrix $\{u_{pq}\}$, we can write the right-hand side of our last equation as

$$\sum_{n=1}^{\infty} (\mathcal{U}^{-1}\xi)_n u_{pn} = (\mathcal{U}\mathcal{U}^{-1}\xi)_p = \xi_p = \sum_{t=1}^{\infty} a_{pt} u_{tk}, \tag{200}$$

whence the first of (199) follows immediately. A direct consequence of (199) is that

$$\sum_{k=1}^{\infty} \Bigl|\sum_{n=1}^{\infty} u_{pn} b_{nk}\Bigr|^2 < \infty; \qquad \sum_{n=1}^{\infty} \Bigl|\sum_{k=1}^{\infty} b_{nk} \bar{u}_{qk}\Bigr|^2 < \infty. \tag{201}$$

For, on writing $\eta_t = \bar{a}_{pt}$, we have an element $\eta$ of $l_2$ with components $\eta_t$, and the right-hand side of the first of (199) can be written as $\overline{(\mathcal{U}^{-1}\eta)_k}$, whence the first of (201) follows at once. The proof of the second is similar. On introducing the further element $\eta'$ of $l_2$ with components $\eta'_n = b_{nk}$, we can write the first of (199) as $(\mathcal{U}\eta')_p = \overline{(\mathcal{U}^{-1}\eta)_k}$. On multiplying by $\bar{u}_{qk}$ and summing over $k$, we obtain

$$a_{pq} = \sum_{k=1}^{\infty} \Bigl(\sum_{n=1}^{\infty} u_{pn} b_{nk}\Bigr) \bar{u}_{qk},$$

i.e. one of formulae (196). The second formula is proved similarly; the theorem is proved.

We shall now give a definition of the unitary equivalence of two systems $a_{nk}\{\varphi_k\}$ and $b_{nk}\{\psi_k\}$, each of which contains a complete system of base vectors.

DEFINITION 2. Two systems $a_{nk}\{\varphi_k\}$ and $b_{nk}\{\psi_k\}$ are said to be unitary equivalents if the unitary matrix $U$ with elements $u_{pq} = (\psi_q, \varphi_p)$ is applicable to $\{a_{nk}\}$ and leads to the transformed matrix $\{b_{nk}\}$.

If the conditions of the definition are satisfied, it follows from the last theorem that the unitary inverse to the matrix $U$ with elements $\{u_{pq}\}$, i.e. the matrix with elements $u^{(-1)}_{pq} = \bar{u}_{qp} = (\varphi_q, \psi_p)$, is applicable to $\{b_{nk}\}$ and leads to the transformed matrix $\{a_{nk}\}$, i.e. the unitary equivalence of two systems is a reciprocal property. Notice also that $U\varphi_k = \psi_k$ and $U^{-1}\psi_k = \varphi_k$. The following is the fundamental theorem on equivalent systems.

THEOREM 2. The necessary and sufficient condition for two systems $a_{nk}\{\varphi_k\}$ and $b_{nk}\{\psi_k\}$ to be unitary equivalents is that the closed symmetric operators $A$ and $A_1$ generated by them have the same symmetric operator $F$ as their extension.

Let us first prove the necessity. Let the systems be unitary equivalents. We have:

$$(A\varphi_p, \psi_q) = \sum_{s=1}^{\infty} (A\varphi_p, \varphi_s)(\varphi_s, \psi_q) = \sum_{s=1}^{\infty} a_{sp} \bar{u}_{sq},$$

$$(\varphi_p, A_1\psi_q) = \sum_{s=1}^{\infty} (\psi_s, A_1\psi_q)(\varphi_p, \psi_s) = \sum_{s=1}^{\infty} b_{qs} \bar{u}_{ps},$$

and the second of (199) shows that $(A\varphi_p, \psi_q) = (\varphi_p, A_1\psi_q)$. The same equation obviously holds for finite linear combinations of $\varphi_k$ and $\psi_k$. On using the definitions of $D(A)$ and $D(A_1)$ and passing to the limit in the scalar products, we obtain

$$(Ax, y) = (x, A_1 y), \quad \text{if } x \in D(A) \text{ and } y \in D(A_1). \tag{202}$$

If $x$ belongs simultaneously to $D(A)$ and $D(A_1)$, in addition to (202) we have $(A_1 x, y) = (x, A_1 y)$, and, by (202), $(Ax - A_1 x, y) = 0$ for any $y \in D(A_1)$, and, since $D(A_1)$ is dense in $H$, we have $Ax = A_1 x$.

Let $D(F)$ be the lineal of elements $x$ expressible as $x = x' + y'$, where $x' \in D(A)$ and $y' \in D(A_1)$, and let us put $Fx = Ax' + A_1 y'$. If we have the two forms: $x = x' + y' = x'' + y''$, where $x'$ and $x'' \in D(A)$, and $y'$ and $y'' \in D(A_1)$, it follows from $x' - x'' = y'' - y'$ that $x' - x''$ and $y'' - y'$ belong simultaneously to $D(A)$ and $D(A_1)$, and, by what has been said, $Ax' - Ax'' = A_1 y'' - A_1 y'$, i.e. $Ax' + A_1 y' = Ax'' + A_1 y''$, whence it follows that the definition of $Fx$ is unique. By (202) and the symmetry of $A$ and $A_1$, $F$ is a symmetric operator. It is evidently an extension of $A$ and $A_1$, and the necessity is proved.

Now suppose that the symmetric operator $F$ is an extension of $A$ and $A_1$. We have

$$\sum_{t=1}^{\infty} a_{pt} u_{tk} = \sum_{t=1}^{\infty} (A\varphi_t, \varphi_p)(\psi_k, \varphi_t) = \sum_{t=1}^{\infty} (\psi_k, \varphi_t)\overline{(A\varphi_p, \varphi_t)} = (\psi_k, A\varphi_p) = (\psi_k, F\varphi_p) = (F\psi_k, \varphi_p) = (A_1\psi_k, \varphi_p) = \sum_{n=1}^{\infty} (A_1\psi_k, \psi_n)(\psi_n, \varphi_p) = \sum_{n=1}^{\infty} u_{pn} b_{nk},$$

i.e. we have obtained the first of formulae (199). The second can be obtained similarly. On repeating the proof of Theorem 1, it may be seen that $\{u_{pq}\}$ is applicable to $\{a_{nk}\}$, and (194), (195) and (196) hold; the unitary equivalence of the systems is thus proved.

It follows from the theorem that the unitary equivalence of two systems is completely determined by the closed symmetric operators generated by them, and the concept of unitary equivalence can be carried over in a natural way from systems to the closed symmetric operators generated by such systems. An immediate consequence of this is that the only unitary equivalent of a bounded operator is itself, and of a maximal operator part of itself. Unitary equivalence is a reciprocal property, but is not transitive, i.e. if an operator $C_1$ is the unitary equivalent of $C_2$ and $C_2$ the unitary equivalent of $C_3$, it does not follow that $C_1$ is the unitary equivalent of $C_3$. This will obviously be the case if $C_2$ is a maximal operator, since $C_1$ and $C_3$ here have the common extension $C_2$. The general theory of C-matrices is substantially different from the theory of matrices corresponding to bounded self-conjugate operators. The theory of C-matrices is expounded in J. von Neumann’s Zur Theorie der unbeschränkten Matrizen (Crelle’s Journal, Bd. 161, 1929) and in Wintner’s book Spektraltheorie der unendlichen Matrizen (Leipzig, 1929).




218. The existence of the spectral function. Let us now turn to the proof of the fundamental theorem of [192], according to which, given any self-conjugate operator $A$, there exists a resolution of the identity $\mathcal{E}_\lambda$ such that $A$ is expressible by the Stieltjes integral (58) of [192]. We can make use here of the properties of $A$ that have been deduced without the aid of (58). We proved without the aid of this formula in [189] that, given any non-real $\lambda$, there exists a bounded operator $(A - \lambda E)^{-1}$, defined throughout $H$, such that the formula $x = (A - \lambda E)u$ maps $D(A)$ one-to-one onto $H$. Let us take the cases $\lambda = \pm i$ and put

$$x = (A - iE)u, \quad y = (A + iE)v \qquad (u, v \in D(A));$$

and

$$u = (A - iE)^{-1}x, \quad v = (A + iE)^{-1}y \qquad (x, y \in H).$$

We have, since $A$ is self-conjugate:

$$[(A - iE)^{-1}]^* = (A + iE)^{-1}.$$

On introducing the bounded self-conjugate operators

$$C = \tfrac{1}{2}\bigl[(A - iE)^{-1} + (A + iE)^{-1}\bigr]; \qquad B = \tfrac{1}{2i}\bigl[(A - iE)^{-1} - (A + iE)^{-1}\bigr], \tag{203}$$

we obtain

$$(A - iE)^{-1} = C + iB, \qquad (A + iE)^{-1} = C - iB, \tag{204}$$

where the elements $Cx$ and $Bx$ belong to $D(A)$ for any choice of $x \in H$. It follows from (204) that

$$(A - iE)(C + iB) = (AC + B) + i(AB - C) = E,$$
$$(A + iE)(C - iB) = (AC + B) - i(AB - C) = E,$$

whence

$$AC = E - B; \tag{205}$$
$$AB = C. \tag{206}$$

If the element $x$ belongs to $D(A)$, we can also write

$$(C + iB)(A - iE)x = x \quad \text{and} \quad (C - iB)(A + iE)x = x,$$

whence we obtain, on removing the brackets and comparing with the above:

$$ACx = CAx, \quad ABx = BAx \qquad (x \in D(A)), \tag{207}$$

i.e. the bounded operators $B$ and $C$ commute with $A$ in the sense of the definition of [191]. It follows from $BC = BAB = ABB = CB$ that

$$CB = BC, \tag{208}$$

i.e. $B$ and $C$ commute with each other. On using (205) and (206), we get $B = BE = B(B + AC) = B^2 + BAC = B^2 + ABC = B^2 + C^2$, i.e. $B$ is a positive operator. Further, it follows from

$$\|x\|^2 = \|(A - iE)u\|^2 = ((A - iE)u, (A - iE)u) = \|Au\|^2 + \|u\|^2,$$
$$\|y\|^2 = \|Av\|^2 + \|v\|^2$$

that $\|u\| \le \|x\|$ and $\|v\| \le \|y\|$, i.e. the norms of the operators $(A - iE)^{-1}$ and $(A + iE)^{-1}$ do not exceed unity, so that, by (203), the same can be said for $B$ and $C$. Let us also show that $Bx = 0$ implies $x = 0$. In fact, if $Bx = 0$, then $(Bx, x) = (B^2 x, x) + (C^2 x, x) = 0$, i.e. $(Bx, Bx) + (Cx, Cx) = 0$ or $\|Bx\|^2 + \|Cx\|^2 = 0$, whence it follows that $Cx = 0$. Now, (205) gives $x = Bx + ACx = 0$. In addition to the properties of $B$ and $C$ given above, we shall require a further lemma, the proof of which is given below.
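
The algebraic relations between $A$, $B$ and $C$ can be sanity-checked in finite dimensions. The sketch below uses a hypothetical $2 \times 2$ Hermitian matrix in place of $A$ (so all domain questions disappear) and small helper functions written only for this illustration; it verifies (205), (206), (208) and $B = B^2 + C^2$ up to rounding.

```python
def inv2(m):
    """Inverse of a 2x2 complex matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def mul(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def lin(p, q, alpha, beta):
    """Linear combination alpha * p + beta * q."""
    return [[alpha * p[i][j] + beta * q[i][j] for j in range(2)]
            for i in range(2)]


def dist(p, q):
    """Largest modulus of an entry of p - q."""
    return max(abs(p[i][j] - q[i][j]) for i in range(2) for j in range(2))


A = [[1, 2 - 1j], [2 + 1j, -3]]          # assumed Hermitian sample matrix
E = [[1, 0], [0, 1]]

R_minus = inv2(lin(A, E, 1, -1j))        # (A - iE)^{-1}
R_plus = inv2(lin(A, E, 1, 1j))          # (A + iE)^{-1}

C = lin(R_minus, R_plus, 0.5, 0.5)               # first formula of (203)
B = lin(R_minus, R_plus, 1 / (2j), -1 / (2j))    # second formula of (203)

gap_205 = dist(mul(A, C), lin(E, B, 1, -1))      # AC = E - B
gap_206 = dist(mul(A, B), C)                     # AB = C
gap_208 = dist(mul(B, C), mul(C, B))             # CB = BC
gap_pos = dist(B, lin(mul(B, B), mul(C, C), 1, 1))  # B = B^2 + C^2

print(gap_205, gap_206, gap_208, gap_pos)
```

In this finite-dimensional setting $B = (A^2 + E)^{-1}$ and $C = A(A^2 + E)^{-1}$, which makes the four relations transparent; the point of the text is that they persist for unbounded $A$.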

LEMMA. Let $M_n$ $(n = 1, 2, \ldots)$ be mutually orthogonal subspaces, the orthogonal sum of which gives the whole of $H$. Further, let a self-conjugate bounded operator $A_n$ be defined in each $M_n$. There now exists a unique self-conjugate operator $A$ in $H$, coinciding with $A_n$ in $M_n$. The lineal $D(A)$ consists of the elements $x$ for which the series formed from $\|A_n x_n\|^2$ is convergent, where $x_n$ has been written for the projection of $x$ onto $M_n$, and we have for these $x$:

$$Ax = \sum_{n} A_n x_n. \tag{209}$$
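
The domain condition of the lemma can be illustrated by a toy case. The sketch below assumes, purely for illustration, that each $M_n$ is one-dimensional and that $A_n$ is multiplication by $n$, so that $\|A_n x_n\|^2 = n^2 |x_n|^2$; the helper name is hypothetical. It compares partial sums for two elements of $l_2$, one inside and one outside $D(A)$.

```python
import math


def domain_partial_sum(component, terms):
    """Partial sum of ||A_n x_n||^2 when A_n multiplies by n on a 1-D M_n."""
    return sum((n * component(n)) ** 2 for n in range(1, terms + 1))


terms = 10000
# x_n = 1/n^2: the series is sum 1/n^2, which converges, so x is in D(A).
inside = domain_partial_sum(lambda n: 1 / n ** 2, terms)
# x_n = 1/n: x is in l_2, but the series is sum 1, which diverges.
outside = domain_partial_sum(lambda n: 1 / n, terms)

print(inside, outside)
```

The first partial sum stays below $\pi^2/6$ however many terms are taken, while the second grows like the number of terms, so the second element does not belong to $D(A)$ even though it belongs to the space.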


Let us return to the operators $B$ and $C$. The spectrum of $B$ lies in the segment $[0, 1]$, and, since $Bx = 0$ implies $x = 0$, the point $\lambda = 0$ does not belong to the point spectrum, so that, if we write $\mathcal{E}'_\lambda$ for the spectral function of $B$, we have $\mathcal{E}'_0 = 0$. On writing $M_n$ for the subspace onto which the operator $(\mathcal{E}'_{1/n} - \mathcal{E}'_{1/(n+1)})$ projects, we can say that the $M_n$ are mutually orthogonal, and that their orthogonal sum is equal to $H$. If the bounded self-conjugate operator $F$ commutes with $\mathcal{E}'_\lambda$, by the theorem of [148], $M_n$ reduces $F$. This will hold for $C$ and any real continuous function of $B$ [193]. Let $\varphi_n(\lambda) = 1/\lambda$ for $1/(n+1) \le \lambda \le 1/n$, and be equal to a constant outside this interval, whilst the continuity is preserved at the ends; let us introduce the self-conjugate bounded operator $\varphi_n(B)$. If $z \in M_n$, then $\mathcal{E}'_\lambda z = z$ for $\lambda \ge 1/n$ and $\mathcal{E}'_\lambda z = 0$ for $\lambda \le 1/(n+1)$. This follows at once from

$$\mathcal{E}'_\lambda z = \mathcal{E}'_\lambda\bigl(\mathcal{E}'_{1/n} - \mathcal{E}'_{1/(n+1)}\bigr)z$$

and the fact that $\mathcal{E}'_\lambda \mathcal{E}'_\mu = \mathcal{E}'_\mu \mathcal{E}'_\lambda = \mathcal{E}'_\mu$ for $\mu \le \lambda$. If we express $\varphi_n(B)z$ and $Bz$ by Stieltjes integrals in terms of $\mathcal{E}'_\lambda$ and make use of the definition of $\varphi_n(\lambda)$ and what has just been said regarding $\mathcal{E}'_\lambda z$, we find that $\varphi_n(B)B = B\varphi_n(B) = E$ in $M_n$, i.e. $B$ and $\varphi_n(B)$ are inverses in $M_n$. If $z \in M_n$, we can write $z = B\varphi_n(B)z$, whence it is clear that $z \in M_n$ implies $z \in D(A)$. On using (206), we can write: $Az = AB\varphi_n(B)z = C\varphi_n(B)z$, whence it is clear that $A$ is a bounded operator in $M_n$. On also observing that $C$ and $\varphi_n(B)$ commute and that $M_n$ reduces $C$ and $\varphi_n(B)$, we can assert that $A$ is bounded and self-conjugate in $M_n$. Let $\mathcal{E}^{(n)}_\lambda$ denote the resolution of the identity corresponding to this operator in $M_n$. We can form, on the basis of the lemma, the self-conjugate operator $\mathcal{E}_\lambda$ in $H$:

$$\mathcal{E}_\lambda = \sum_{n=1}^{\infty} \mathcal{E}^{(n)}_\lambda, \tag{210}$$

and it may easily be verified that $\mathcal{E}_\lambda$ is a resolution of the identity in $H$. If we form the Stieltjes integral, which defines a self-conjugate operator,

$$\int_{-\infty}^{+\infty} \lambda\, d\mathcal{E}_\lambda x,$$

it will be seen, on noting that $\mathcal{E}_\lambda x = \mathcal{E}^{(n)}_\lambda x$ if $x \in M_n$, that it yields $A$ if $x \in M_n$, and hence, by the lemma, it defines the operator $A$. It remains to prove the lemma stated above.
Let 2 belong to the lineal D(A) of the elements for which 


> [lán tal? < œ 5 


we define an operator A by the formula Ax = A,2,+ 4z + ... Let us 
show that A is self-conjugate. The lineal D(A) clearly contains finite sums of 
elements æn, so that D(A) is dense in H. The symmetry of A follows from the 
fact that A, is self-conjugate and 


(Az, y) = (2 An Tp a Yn) = z (An 2m Yn) = > (Ep An Yn) = 


= (2 Tr > AnYn) = (x, Ay), 


where z and y € D(A) and use has been made of the continuity and distributive- 
ness of the scalar product, as also of the orthogonality of the subspaces Mp. 
Thus A* >A, and we have to show that, if x € D(A*), then x € D(A). On 
observing that A? x, belongs to Mn and that x — x, 1 Mn, we can write 


(A* (x — En), ALn) = (2 — Ep, A? £p) = 0, 
whence, by Pythagoras’ theorem, when n = 1, 
|| A* 2]? = ||A* (æ — s) ||? + ||Aa, |). 
We can similarly write 
: ||A* (a — 2) ||? = ||A* (@ — x, — z) ||? + ||Aa, ||?, 
i.e. 
||A* a||? = ||A* (x — xı — ae) ||? + || Ax, ||? + [Ace ||, 
and in general 


n 
|A* ol? = ||A* (x — zı — £z — ... — Lp) ||? + 2 4z, ||’, 
kæ 
whence it follows that 


Š ll Anzali < At alf, 


so that $x \in D(A)$; we have proved that $A$ is self-conjugate. It remains to show
that there exists a unique self-conjugate operator coinciding with $A_n$ on $M_n$.
Suppose that, in addition to the $A$ obtained, there exists a further operator
$A'$. Since $A'$ is self-conjugate, it must be closed. We have for finite sums:

$$A' \sum_{k=1}^{m} x_k = A \sum_{k=1}^{m} x_k = \sum_{k=1}^{m} A_k x_k,$$

since $A$ and $A'$ coincide on $M_n$. In view of the fact that $A'$ is closed, we can say
that $A'$ is defined on $D(A)$ and coincides on this lineal with $A$, i.e. $A' \supset A$.
On the other hand, on replacing $A^*$ by $A'$ in the proof given above, we can conclude
that, if $x \in D(A')$, then $x \in D(A)$, so that $A'$ coincides with $A$; the
lemma is proved.
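
The construction in the lemma may be illustrated by a simple example. What follows is only a sketch under assumptions that do not appear in the text: the space $H = l_2$, the unit vectors $e_n$ and the particular operators $A_n$ are chosen purely for illustration. It shows how the condition $\sum \|A_n x_n\|^2 < \infty$ singles out the lineal $D(A)$ when the operators $A_n$ are bounded on each $M_n$ but not uniformly bounded.

```latex
% Illustrative example only: H = l_2, e_n and A_n are assumptions of this sketch.
% M_n is the one-dimensional subspace spanned by the n-th unit vector e_n,
% and A_n is multiplication by n on M_n (bounded and self-conjugate on M_n).
\[
H = l_2, \qquad M_n = \{\xi e_n\}, \qquad A_n(\xi e_n) = n\,\xi e_n
\quad (n = 1, 2, \dots).
\]
% For x = (\xi_1, \xi_2, \dots) the component in M_n is x_n = \xi_n e_n,
% so that the domain and the operator of the lemma take the form
\[
D(A) = \Bigl\{ x \in l_2 : \sum_{n=1}^{\infty} n^2 |\xi_n|^2 < \infty \Bigr\},
\qquad
Ax = \sum_{n=1}^{\infty} n\,\xi_n e_n .
\]
% The element with \xi_n = 1/n belongs to l_2 but not to D(A):
\[
\sum_{n=1}^{\infty} \frac{1}{n^2} < \infty,
\qquad
\sum_{n=1}^{\infty} n^2 \cdot \frac{1}{n^2} = \infty .
\]
% The element with \xi_n = 1/n^2 belongs to D(A):
\[
\|Ax\|^2 = \sum_{n=1}^{\infty} n^2 \cdot \frac{1}{n^4}
         = \sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6} .
\]
```

In this example $A$ is unbounded, since $\|A e_n\| = n$, although every $A_n$ is bounded; by the lemma, $A$ is nevertheless self-conjugate on the lineal $D(A)$ indicated.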


INDEX 


Bessel’s inequality 160, 367 


Class B of functions 126 
Commutation of operators 379, 469, 
475, 483, 558, 575 
Compactness 
criteria for 274 
examples of 273—277 
in metric space 271 
weak, in Hilbert space 398 
Comparison of semi-bounded operators 602
Completeness 
of space 258, 283, 369 
weak, of regular space 302 
Completion of metric space 259 
Convergence 
in linear normed space 283 
in mean 154 
in metric space 257 
of operators 311 
weak 
in C 309 
of elements 298 
of functionals 295 
in Hilbert space 398 
in lp and Lp 308, 309
of operators 312 


Defect indices 590 

Differential solutions of equation 
457, 502, 516, 574 

Differentiation of functionals on QO 
360 


Eigenvalues 389, 571 
of self-conjugate operator 391, 441, 
456 


Equation(s) 
closure 161, 371 
generalized 163, 372 
operator 315, 405 
Equivalence of functions 115 
Equivalent norms in spaces W(P 342 
Extension 
of functionals 290 
of operators in continuity 287 
of semi-bounded operators 597 
of symmetric operators 586 
Exterior measure 92 


Field of sets, B 106 
Fourier’s transformation 520, 523 
Fubini’s theorem 188 
Functionals 
bilinear and quadratic 381 
on compact sets 273 
linear 
in C 43, 302 
in CO 356 
of function type 357 
in Hilbert space 376 
in lp 308 
in Lp 302 
in normed space 265 
Functions 
absolutely continuous 216, 230 
jump 18 
set 
absolutely continuous 215, 221, 243
additive 209 
singular 213 
systems of, orthogonal 160 
closed and complete 162 
of variation, bounded 24 
on plane 66 
of several variables 69 


Fundamental sequence 259


Generalized derivatives 321
Graph of operator


Hermitian functions 524


Inequality
Bessel's 160, 367
Friedrich's 355
Hilbert's 606
Hölder and Minkovskii's 173
Poincaré's 355
Integral
Cauchy-Stieltjes 77
inversion formula for 78
form of function of W, Sobolev's 340
Fourier-Stieltjes 70
convolution theorem for 75
inversion formula for 73
Hellinger 247, 253
Lebesgue 129
-Stieltjes 128, 134, 178, 185
passage to limit under integral sign 147
Stieltjes 4, 64
of continuous function 13
existence 32, 56
general 7
physical interpretation of 22
Interval, functions of 48


Kernels, dependent on difference 527


Lineal
in B space 283
in L2 168
Linear envelope 284
independence of elements in B space 282


Matrices
C 625
Jacobian 499, 618
projection 491
self-conjugate 492, 500
unitary 490, 491
Mean functions 201, 322
Measurability
of function 113
of set 95
criteria for 105


Norm
absolute, of operator 419
equivalent 330
of bounded operator 286, 378
of element, normed space 282
of functional 289, 376, 382


Operators
bounded 286, 485, 510
closed 533
conjugate 313, 380, 484, 510, 535, 593
continuous 286
completely 314, 317-320, 400, 410, 508, 512
distributive 286
in Hilbert space 288
hypermaximal 592
integral 509, 583
inverse 288, 385
irreducible 452
isometric 418, 583, 584
linear 288, 378
maximal 592
in metric space 265
of multiplication by independent variable 460, 525, 579
normal 473
normally soluble 537
in normed space 285
positive 384, 541
definite 541
projection 378, 423
self-conjugate 380, 542
semi-bounded 540
symmetric 540
unbounded 533, 543
unitarily equivalent 417
unitary 415, 517

Orthogonality of elements, Hilbert 
space 369 

Orthogonalization of elements 371 


Perturbation of spectrum 471 

Principle 
of compressed mappings 282 
selection 50 

Product, interior 312 

Projection of element, Hilbert space 
373 


Reducibility of operator 560 
Regular point of operator 391, 
606 
Representation of functional, Hilbert 
space 376 
Resolution 
of identity 428, 562 
spectral 
of self-conjugate operators 434, 
562 
of unitary operators 464 
Resolvent 391, 396, 438, 569 
matrices of 494 




Scalar product 367 
Separability 280 
Series, power, of operator 397, 479, 480 
Sets 81 
B 110 
closed 86 
measurable 97 
open 86 
Space(s) 
C 40, 70, 267 
oO 340, 356 
CO 356 
conjugate 293 
Hilbert 367 
isometric 259, 284 
K 366 
l and L, 165, 173, 182, 266, 483, 508 
I, and L, 168, 173, 182, 266, 509 


linear normed 281 
of linear operators 311 
metric 257 
regular 294 
En m, 8, M, S, V 266—268 
UO 356 
W® 329, 332 
WY 329, 332 
WY? 337 
Spectral function 433, 480, 568 
of integral operator in L, 514, 515 
Spectrum 
condensation points of 456, 471, 
472 
continuous 444, 453 
kernel of 608 
limiting 456 
mixed 455, 572 
of operator 389 
point 456 
purely 
continuous 444, 572 
point 443, 572 
of self-conjugate operator 607 
simple 498 
continuous 450 
small perturbations of 577 
Subdivisions 9 
continuations of 10 
product of 10 
Subspaces 
invariant 450, 558 
of linear space 284 
mutually complementary 376 
operations of 421 
Sums 
Lebesgue 129 
Riemann-Stieltjes 5 
Stieltjes-Darboux 9 


Theorems, embedding 357 


Unitary equivalence 
of matrices 501 
of self-conjugate operators 463 


VOLUMES PUBLISHED IN THIS SERIES


Vol. 1. WALLACE—Introduction to Algebraic Topology
Vol. 2. PEDOE—Circles
Vol. 3. SPAIN—Analytical Conics
Vol. 4. MIKHLIN—Integral Equations
Vol. 5. EGGLESTON—Problems in Euclidean Space: Application of Convexity
Vol. 6. WALLACE—Homology Theory on Algebraic Varieties
Vol. 7. NOBLE—Methods Based on the Wiener-Hopf Technique for the Solution of Partial Differential Equations
Vol. 8. MIKUSINSKI—Operational Calculus
Vol. 9. HEINE—Group Theory in Quantum Mechanics
Vol. 10. BLAND—The Theory of Linear Viscoelasticity
Vol. 11. KURTH—Axiomatics of Classical Statistical Mechanics
Vol. 12. FUCHS—Abelian Groups
Vol. 13. KURATOWSKI—Introduction to Set Theory and Topology
Vol. 14. SPAIN—Analytical Quadrics
Vol. 15. HARTMAN and MIKUSINSKI—Theory of Lebesgue Measure and Integration
Vol. 16. KULCZYCKI—Non-Euclidean Geometry
Vol. 17. KURATOWSKI—Introduction to Calculus
Vol. 18. GERONIMUS—Polynomials Orthogonal on a Circle and Interval
Vol. 19. ELSGOLC—Calculus of Variations
Vol. 20. ALEXITS—Convergence Problems of Orthogonal Series
Vol. 21. FUCHS and LEVIN—Functions of a Complex Variable, Volume II
Vol. 22. GOODSTEIN—Fundamental Concepts of Mathematics
Vol. 23. KEENE—Abstract Sets and Finite Ordinals
Vol. 24. DITKIN and PRUDNIKOV—Operational Calculus in Two Variables and its Applications
Vol. 25. VEKUA—Generalized Analytic Functions
Vol. 26. AMIR-MOEZ and FASS—Elements of Linear Spaces
Vol. 27. GRADSHTEIN—Direct and Converse Theorems
Vol. 28. FUCHS—Partially Ordered Algebraic Systems
Vol. 29. POSTNIKOV—Foundations of Galois Theory
Vol. 30. BERMANT—Course of Mathematical Analysis, Part II
Vol. 31. LUKASIEWICZ—Elements of Mathematical Logic
Vol. 32. VULIKH—Introduction to Functional Analysis for Scientists and Technologists
Vol. 33. PEDOE—Introduction to Projective Geometry


Vol. 34. TIMAN—Theory of Approximation of Functions of a Real Variable
Vol. 35. CSASZAR—Foundations of General Topology
Vol. 36. BRONSHTEIN and SEMENDYAYEV—A Guide Book to Mathematics for Technologists and Engineers
Vol. 37. MOSTOWSKI and STARK—Introduction to Higher Algebra
Vol. 38. GODDARD—Mathematical Techniques of Operational Research
Vol. 39. TIKHONOV and SAMARSKII—Equations of Mathematical Physics
Vol. 40. McLEOD—Introduction to Fluid Dynamics
Vol. 41. MOISIL—The Algebraic Theory of Switching Circuits
Vol. 42. OTTO—Nomography
Vol. 43. RANKIN—An Introduction to Mathematical Analysis
Vol. 44. BERMANT—A Course of Mathematical Analysis, Part I
Vol. 45. KRASNOSEL'SKII—Topological Methods in the Theory of Nonlinear Integral Equations
Vol. 46. KANTOROVICH and AKILOV—Functional Analysis in Normed Spaces
Vol. 47. JONES—The Theory of Electromagnetism
Vol. 48. FEJES TOTH—Regular Figures
Vol. 49. YANO—Differential Geometry on Complex and Almost Complex Spaces
Vol. 50. MIKHLIN—Variational Methods in Mathematical Physics
Vol. 51. FUCHS and SHABAT—Functions of a Complex Variable and Some of their Applications, Vol. I
Vol. 52. BUDAK, SAMARSKII and TIKHONOV—A Collection of Problems on Mathematical Physics
Vol. 53. GILES—Mathematical Foundations of Thermodynamics
Vol. 54. SAUL'YEV—Integration of Equations of Parabolic Type by the Method of Nets
Vol. 55. PONTRYAGIN et al.—The Mathematical Theory of Optimal Processes
Vol. 56. SOBOLEV—Partial Differential Equations of Mathematical Physics
Vol. 57. SMIRNOV—A Course of Higher Mathematics, Vol. I
Vol. 58. SMIRNOV—A Course of Higher Mathematics, Vol. II
Vol. 59. SMIRNOV—A Course of Higher Mathematics, Vol. III/1
Vol. 60. SMIRNOV—A Course of Higher Mathematics, Vol. III/2
Vol. 61. SMIRNOV—A Course of Higher Mathematics, Vol. IV


